{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Walkthough of Vamb from the Python interpreter\n",
    "\n",
    "The Vamb pipeline consist of a series of tasks each which have a dedicated module:\n",
    "\n",
    "1) Parse fasta file and get TNF of each sequence, as well as sequence length and names (module `parsecontigs`)\n",
    "\n",
    "2) Parse the BAM files and get abundance estimate for each sequence in the fasta file (module `parsebam`)\n",
    "\n",
    "3) Train a VAE with the abundance and TNF matrices, and encode it to a latent representation (module `encode`)\n",
    "\n",
    "4) Cluster the encoded inputs to metagenomic bins (module `cluster`)\n",
    "\n",
    "Additionally, for developing and testing Vamb, we use:\n",
    "\n",
    "5) Benchmark the resulting bins against a gold standard (module `benchmark`)\n",
    "\n",
    "In the following chapters of this walkthrough, we will go through each step in more detail from within the Python interpreter. We will show how to use Vamb by example, what each step does, some of the theory behind the actions, and the different parameters that can be set. With this knowledge, you should be able to extend or alter the behaviour of Vamb more easily.\n",
    "\n",
    "For the examples, we will assume the following relevant prerequisite files exists in the directory `/Users/jakobnissen/Downloads/example/`:\n",
    "\n",
    "* `contigs.fna.gz` - The filtered FASTA contigs which were mapped against, and\n",
    "* `bam/*.bam` - 9 BAM files from the mapping.\n",
    "\n",
    "## Table of contents:\n",
    "\n",
    "### 1. [Introduction to metagenomic binning and best practices](#introduction)\n",
    "\n",
    "### 2. [Importing Vamb and getting help](#importing)\n",
    "\n",
    "### 3. [Calculate the sequence tetranucleotide frequencies](#parsecontigs)\n",
    "\n",
    "### 4. [Calculate the abundance matrix](#parsebam)\n",
    "\n",
    "### 5. [Train the autoencoder and encode input data](#encode)\n",
    "\n",
    "### 6. [Binning the encoding](#cluster)\n",
    "\n",
    "### 7. [Postprocessing the bins](#postprocessing)\n",
    "\n",
    "### 8. [Summary of full workflow](#summary)\n",
    "\n",
    "### 9. [Running VAMB with low memory (RAM)](#memory)\n",
    "\n",
    "### 10. [Optional: Benchmarking your bins](#benchmark)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<a id=\"introduction\"></a>\n",
    "## Introduction to metagenomic binning and best practices\n",
    "\n",
    "When we began working on the Vamb project, we had been doing reference-free metagenomics for years using MetaBAT2, and thought we had a pretty good idea of how metagenomic binning worked and what to watch out for. Turns out, there's more to it than we had thought about. During this project, our group have revised the way we use binners multiple times as we've learned the dos and don'ts. In this introduction I will cover some facts of binning that is useful for the end user.\n",
    "\n",
    "__1. Binning is a hard problem__\n",
    "\n",
    "Various binning papers (including ours), typically brag about how many high-quality genomes they've been able to reconstruct from data. However, examining the actual number of genomes reconstructed compared to the estimated diversity of the samples can be sobering: Only a very small fraction of the organisms are binned decently. In fact, only a small fraction of the *sequenced* organisms are binned decently!\n",
    "\n",
    "Information is lost at every step along the way: In the sequencing itself, during assembly, when estimating contig abundance, and during the binning. For now, this is just a fact to accept, but we hope it will improve in the future, especially with better long-read sequencing technologies.\n",
    "\n",
    "__2. There is a tradeoff in what contigs to discard__\n",
    "\n",
    "Typically, the inputs to the binners are sequence composition and/or sequence abundance. Both these metrics are noisy for smaller contigs, as the kmer composition is not stable for small contigs, and the estimated depth across a genome varies wildly. This causes small contigs to be hard to handle for binners, as the noise swamps the signal.\n",
    "\n",
    "Worse, the presence of hard-to-bin contigs may adversely affect the binning of easy-to-bin contigs depending on the clustering algorithm used. One solution is to discard small contigs, but here lies a tradeoff: When are you just throwing away good data? This is especially bitter when the large majority of metagenomic assemblies usually lies in short contigs. With current (2019) assembly and mapping technology, we are usign a threshold of around 2000 bp.\n",
    "\n",
    "__3. Garbage in, garbage out__\n",
    "\n",
    "Binners like MaxBin, MetaBAT2 and Vamb uses kmer frequencies and estimated depths to bin sequences. While the kmer frequency is easy to calculate (but sensitive to small contigs - see above), this is not the case for depth. Depths are estimated by mapping reads back to the contigs and counting where they map to. Here especially, it's worth thinking about exactly how you map:\n",
    "\n",
    "* In general, if reads are allowed to map to subjects with a nucleotide identity of X %, then it is not possible to distinguish genomes with a higher nucleotide identity than X % using co-abundance, and these genomes will be binned together. This means you want to tweak your alignment tool to only output alignments with the desired minimum query/subject identity - for example 97%.\n",
    "\n",
    "* The MAPQ field of a BAM/SAM file is the probability that the mapping position is correct. Typically, aligners outputs a low MAPQ score if a read can map to multiple sequences. However, because we expect to have multiple similar sequences in metagenomics (e.g. the same strain from different samples), filtering low MAPQ-alignments away using e.g. Samtools will give bad results.\n",
    "\n",
    "So when creating your BAM files, make sure to think of the above. Furthermore, there are some issues to think about when estimating depths:\n",
    "\n",
    "* If a read maps well to N references, the read needs to count 1/N towards the depth of each of those references. Not counting secondary alignments is misleading (see above), and having it count 1 towards each reference will cause the depth of conserved genomic regions to be overestimated, which can affect binning.\n",
    "\n",
    "* You need to consider how to count a properly mapped read pair versus a pair with each mate mapping to distinct references. Some aligners e.g. BWA MEM will assign reads seemingly independent of their mates, in which case each mate should count independently towards the depth.\n",
    "\n",
    "The BAM parsing module in Vamb takes care of these two latter points.\n",
    "\n",
    "__4. It's hard to figure out if a binner is doing well__\n",
    "\n",
    "What makes a good binner? This question quickly becomes subjective or context dependent:\n",
    "\n",
    "* What's the ideal precision/recall tradeoff? One researcher might prefer pure but fragmented genomes, whereas another might want complete genomes, and care less about the purity.\n",
    "\n",
    "* What's the ideal cutoff in bin quality where the binner should not output the bin? Is it better to only produce good bins, or produce all bins, including the dubious ones?\n",
    "\n",
    "* Which taxonomic level is the right level to bin at? If nearly identical genomes from different samples go in different bins, is that a failure of the binner, or successfully identitfying microdiversity? What about splitting up strains? Species? Since the concept of a bacterial species is arbitrary, should the binner ideally group along nucleotide identity levels, regardless of the taxonomic annotation?\n",
    "\n",
    "* Are plasmids, prophages etc. considered a necessary part of a bin? Should a binner strive to bin plasmids with their host? What if it comes at a cost of genomic recall or precision?\n",
    "\n",
    "* If a genome is split equally into three bins, with the genome being a minority in each bin does that mean the genome is present at a recall of 0%, because each bin by majority vote will be assigned to another genome, a recall of 33%, because that's the maximal recall in any particular bin, or at 100%, because the whole genome is present in the three bins?\n",
    "\n",
    "* Should contigs be allowed to be present in multiple bins? If so, any ambiguity in binning can be trivially resolved by producing multiple bins where there is ambiguity (in extrema: Outputting the powerset of input contigs) - but surely, this just pushes the problems to the next point in the pipeline. If not, then conserved genomic regions, where reads from multiple species can be assembled into a single contig, cannot be binned in the multiple bins where it is actually present.\n",
    "\n",
    "* What postprocessing can you assume people are doing after the binning? Given e.g. a bin deduplication postprocessing step, creating multiple similar bins does not matter so much. However, if no postprocessing is done, doing so will give misleading results. \n",
    "\n",
    "Unfortunately, all these choices have a significant impact on how you asses performance of a binner - and thus, *how you create a binner in the first place*. That means, given different definitions of binning quality, different binners might to worse or better.\n",
    "\n",
    "__5. In summary__\n",
    "\n",
    "Here's the workflow we are using now:\n",
    "\n",
    "* Assemble your bins one sample at a time using a dedicated metagenomic assembler. We recommend metaSPAdes.\n",
    "* Concatenate the contigs/scaffolds to a single FASTA file, making sure that your FASTA headers are all unique.\n",
    "* Remove contigs < 2000 bp from the FASTA file.\n",
    "* Map reads from each sample to the FASTA file. Make sure to set the minimum accepted mapping identity threshold to be similar to the identity threshold with which you want to bin. Do not sort the BAM files, or if they are already sorted, sort again by read name. Do not filter for poor MAPQ. Output all secondary alignments.\n",
    "* Run Vamb with default parameters, unless you really know what you are doing.\n",
    "* After binning, split your bins according to the sample they originated from. In this way, you can bin using co-abundance across samples, while still seeing microdiversity from sample to sample."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<a id=\"importing\"></a>\n",
    "## Importing Vamb and getting help\n",
    "\n",
    "First step is to get Vamb imported. If you installed with `pip`, it should be directly importable. Else, you might have to add the path of Vamb to `sys.path`"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "import vamb"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "When using Vamb, you'll almost certianly need help (we wish it was so easy you didn't, but making user friendly software is *hard!*).\n",
    "\n",
    "Luckily, there's the built-in `help` function in Python."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Help on package vamb:\n",
      "\n",
      "NAME\n",
      "    vamb\n",
      "\n",
      "DESCRIPTION\n",
      "    Vamb - Variational Autoencoders for Metagenomic Binning\n",
      "    Documentation: https://github.com/RasmussenLab/vamb/\n",
      "    \n",
      "    Vamb contains the following modules:\n",
      "    vamb.vambtools\n",
      "    vamb.parsecontigs\n",
      "    vamb.parsebam\n",
      "    vamb.encode\n",
      "    vamb.cluster\n",
      "    vamb.benchmark\n",
      "    \n",
      "    General workflow:\n",
      "    1) Filter contigs by size using vamb.vambtools.filtercontigs\n",
      "    2) Map reads to contigs to obtain BAM file\n",
      "    3) Calculate TNF of contigs using vamb.parsecontigs\n",
      "    4) Create RPKM table from BAM files using vamb.parsebam\n",
      "    5) Train autoencoder using vamb.encode\n",
      "    6) Cluster latent representation using vamb.cluster\n",
      "    7) Split bins using vamb.vambtools\n",
      "\n",
      "PACKAGE CONTENTS\n",
      "    __main__\n",
      "    _vambtools\n",
      "    benchmark\n",
      "    cluster\n",
      "    encode\n",
      "    parsebam\n",
      "    parsecontigs\n",
      "    vambtools\n",
      "\n",
      "DATA\n",
      "    __authors__ = ('Jakob Nybo Nissen', 'Simon Rasmussen')\n",
      "    __licence__ = 'MIT'\n",
      "\n",
      "VERSION\n",
      "    (3, 0, 1)\n",
      "\n",
      "FILE\n",
      "    /Users/jakobnissen/Documents/scripts/vamb/vamb/__init__.py\n",
      "\n",
      "\n"
     ]
    }
   ],
   "source": [
    "help(vamb)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "You can also get help for each of the modules, for example the `cluster` module:\n",
    "\n",
    "`>>> help(vamb.cluster)`\n",
    "\n",
    "    Help on module vamb.cluster in vamb:\n",
    "\n",
    "    NAME\n",
    "        vamb.cluster - Iterative medoid clustering.\n",
    "\n",
    "    DESCRIPTION\n",
    "        Usage:\n",
    "        >>> clusters = list(ClusterIterator(matrix))\n",
    "\n",
    "        Implements one core function, cluster, along with the helper\n",
    "        functions write_clusters and read_clusters.\n",
    "\n",
    "    CLASSES\n",
    "    \n",
    "    [ lines elided ]\n",
    "        \n",
    "---\n",
    "And for functions:\n",
    "\n",
    "`>>> help(vamb.cluster.cluster)`\n",
    "\n",
    "    Help on function cluster in module vamb.cluster:\n",
    "\n",
    "    cluster(matrix, maxsteps=25, windowsize=200, minsuccesses=20, destroy=False, normalized=False, cuda=False)\n",
    "        Legacy constructor for ClusterGenerator object. Deprecated - just use\n",
    "        ClusterGenerator instead.\n",
    "\n",
    "        Inputs:\n",
    "            matrix: A (obs x features) Numpy matrix of data type numpy.float32\n",
    "            maxsteps: Stop searching for optimal medoid after N futile attempts [25]\n",
    "            windowsize: Length of window to count successes [200]\n",
    "            minsuccesses: Minimum acceptable number of successes [15]\n",
    "            destroy: Save memory by destroying matrix while clustering [False]\n",
    "            normalized: Matrix is already preprocessed [False]\n",
    "\n",
    "        Output: Same as ClusterGenerator(args*, kwargs**)\n",
    "\n",
    "---"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<a id=\"parsecontigs\"></a>\n",
    "## Calculate the sequence tetranucleotide frequencies\n",
    "\n",
    "If you forget what to do at each step, remember that `help(vamb)` said:\n",
    "\n",
    "    General workflow:\n",
    "    1) Filter contigs by size using vamb.vambtools.filtercontigs\n",
    "    2) Map reads to contigs to obtain BAM file\n",
    "    3) Calculate TNF of contigs using vamb.parsecontigs\n",
    "    \n",
    "    [ lines elided ]\n",
    "\n",
    "Okay, we already have filtered contigs. I could have used the `vamb.vambtools.filtercontigs` to filter the FASTA file, but here, they were already filtered. We have also already mapped reads to them and gotten BAM files, so we begin with the third step, using the `vamb.parsecontigs` module. How do you use that?"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 48,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Help on module vamb.parsecontigs in vamb:\n",
      "\n",
      "NAME\n",
      "    vamb.parsecontigs - Calculate tetranucleotide frequency from a FASTA file.\n",
      "\n",
      "DESCRIPTION\n",
      "    Usage:\n",
      "    >>> with open('/path/to/contigs.fna', 'rb') as filehandle\n",
      "    ...     tnfs, contignames, lengths = read_contigs(filehandle)\n",
      "\n",
      "FUNCTIONS\n",
      "    read_contigs(filehandle, minlength=100)\n",
      "        Parses a FASTA file open in binary reading mode.\n",
      "        \n",
      "        Input:\n",
      "            filehandle: Filehandle open in binary mode of a FASTA file\n",
      "            minlength: Ignore any references shorter than N bases [100]\n",
      "        \n",
      "        Outputs:\n",
      "            tnfs: An (n_FASTA_entries x 103) matrix of tetranucleotide freq.\n",
      "            contignames: A list of contig headers\n",
      "            lengths: A Numpy array of contig lengths\n",
      "\n",
      "FILE\n",
      "    /Users/jakobnissen/Documents/scripts/vamb/vamb/parsecontigs.py\n",
      "\n",
      "\n"
     ]
    }
   ],
   "source": [
    "help(vamb.parsecontigs)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "---\n",
    "I use `vamb.parsecontigs.read_contigs` with the inputs and outputs as written:\n",
    "\n",
    "---"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [],
   "source": [
    "path = '/Users/jakobnissen/Downloads/example/contigs.fna.gz'\n",
    "\n",
    "# Use Reader to open plain or zipped files. File must be opened in binary mode\n",
    "with vamb.vambtools.Reader(path, 'rb') as filehandle:\n",
    "    tnfs, contignames, lengths = vamb.parsecontigs.read_contigs(filehandle)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "---\n",
    "Let's have a look at the resulting data\n",
    "\n",
    "---"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Type of tnfs: <class 'numpy.ndarray'> of dtype float32\n",
      "Shape of tnfs: (57762, 103)\n",
      "\n",
      "Type of contignames: <class 'list'>\n",
      "Length of contignames: 57762\n",
      "\n",
      "First 5 elements of contignames:\n",
      "S5C2\n",
      "S5C11\n",
      "S5C30\n",
      "S5C47\n",
      "S5C53\n",
      "\n",
      "Type of lengths: <class 'numpy.ndarray'> of dtype int64\n",
      "Length of lengths: 57762\n",
      "\n",
      "First 5 elements of lengths:\n",
      "131131\n",
      "58724\n",
      "121618\n",
      "2179\n",
      "2698\n"
     ]
    }
   ],
   "source": [
    "print('Type of tnfs:', type(tnfs), 'of dtype', tnfs.dtype)\n",
    "print('Shape of tnfs:', tnfs.shape, end='\\n\\n')\n",
    "\n",
    "print('Type of contignames:', type(contignames))\n",
    "print('Length of contignames:', len(contignames), end='\\n\\n')\n",
    "\n",
    "print('First 5 elements of contignames:')\n",
    "for i in range(5):\n",
    "    print(contignames[i])\n",
    "\n",
    "print('\\nType of lengths:', type(lengths), 'of dtype', lengths.dtype)\n",
    "print('Length of lengths:', len(lengths), end='\\n\\n')\n",
    "\n",
    "print('First 5 elements of lengths:')\n",
    "for i in range(5):\n",
    "    print(lengths[i])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "---\n",
    "__For a zipped FASTA file__, simply use `with open vamb.vambtools.Reader('/path/to/contigs.fna' ,'rb')`. The `Reader` automatically and transparently reads plaintext, gzipped, bzip2 and .xz files, and return an opened file object. You can also pass an ordinary file handle (or indeed any iterable of lines as `bytes` objects) to `vamb.parsecontigs.read_contigs`.\n",
    "\n",
    "Note that reading zipped files will slow down the FASTA parsing quite a bit. But the time spent parsing the FASTA file will likely still be insignificant compared to the other steps of Vamb.\n",
    "\n",
    "__The rationale for parsing the contigs__ is that it turns out that related organisms tend to share a similar kmer-distribution across most of their genome. The reason for that is not understood, even though it's believed that common functional motifs, GC-content and presence/absence of endonucleases explains some of the observed similary.\n",
    "\n",
    "The `tnfs` is the tetranucleotide frequency - it's the frequency of the canonical kmer of each 4mer in the contig. We use 4-mers because there are 256 4-mers, which is an appropriate number of features to cluster - not so few that there's no signal and not so many it becomes unwieldy and the estimates of the frequencies become uncertain. We could also have used 3-mers. In tests we have made, 3-mers are *almost*, but not quite as good as 4-mers for separating different species. You could probably switch tetranucleotide frequency to trinucleotide frequency in Vamb without any significant drop of accuracy. However, there are 1024 5-mers, that would be too many features to handle comfortably, and it could easily cause memory issues.\n",
    "\n",
    "We do not work with tetranucleotide frequencies (TNF) directly. TNFs are highly correlated, for example, we would expect the frequency of `AAGA` to be very similar to `AGAA`. Some of these correlations are due to e.g. GC content which is a valid signal. But some of these correlations are due to interdependencies between the TNFs which does NOT constitute a signal and is pure redundancy in the data. Removing this saves RAM, computing resources, and may also make the VAE converge better. There are three such dependencies:\n",
    "\n",
    "* Being frequencies, they must sum to 1.\n",
    "* The frequency of a kmer must be the same as the reverse complement (RC), because the RC is simply present on the other strand. We only observe one strand, which one is arbitrary. So we average between kmers and RC. For example, if we see 31 AAGA and 24 TCTT, we will note (31+24)/2 = 27.5 of each instead.\n",
    "* Every time we observe a kmer with the bases ABCD, we *must* also observe the immediately following kmer, BCDE. Hence, for every 3-mer XYZ, it is true that (AXYZ + CXYZ + GXYZ + TXYZ) = (XYZA + XYZC + XYZG + XYZT). This is not always true for finite contigs because the 4mer at the end has no \"next\" 4mer. But that fact is a tiny measurement error in the first place, so we do not care about that.\n",
    "\n",
    "Based on these constraints, we can create a set of linear equations, and then find the kernel matrix L which projects any matrix of 4mer frequencies T to a matrix P with fewer features (103 in this case):\n",
    "\n",
    "$$\\textbf{P} = \\textbf{TL}$$\n",
    "\n",
    "We enforce constraint 2 by tallying the mean of each kmer and its frequency. This reverse complement averaging can be done using a matrix multiply on the raw tetranucleotide frequencies F with a reverse-complementing matrix R:\n",
    "\n",
    "$$\\textbf{T} = \\textbf{FR}$$\n",
    "\n",
    "And since matrix multiplication is associative, we can create a single kernel K which does both the reverse complementation averaging and projects down to 103 features in one matrix multiply:\n",
    "\n",
    "$$\\textbf{P} = \\textbf{TL} = (\\textbf{FR})\\textbf{L} = \\textbf{F}(\\textbf{RL}) = \\textbf{FK}$$\n",
    "\n",
    "The kernel K is calculated in `src/create_kernel.py`, but comes pre-calculated with Vamb. See the method section of [Kislyuk et al. 2009](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2765972/) for more details.\n",
    "\n",
    "__The argument `minlength`__ sets the filter removing any contigs shorter than this. We don't know exactly what the sweet spot is, but it's probably somewhere around ~2000 bp.\n",
    "\n",
    "The problem with filtering contigs using `minlength` is that the smaller contigs which are thrown away will still recruit reads during the mapping that creates the BAM files, thus removing information from those reads. For that reason, we recommend filtering the contigs *before* mapping.\n",
    "\n",
    "__The memory consumption of Vamb can be an issue__, so at this point, you should probably consider whether you have enough RAM. This is a small dataset, so there's no problem. With hundreds of samples and millions of contigs however, this becomes a problem, even though Vamb is fairly memory-friendly. If you think memory might be an issue, see the [Running VAMB with low memory (RAM)](#memory) section.\n",
    "\n",
    "<a id=\"parsebam\"></a>\n",
    "## Calculate the abundance matrix\n",
    "\n",
    "Again, we can use the help function to see what we need to do\n",
    "    \n",
    "`>>> help(vamb.parsebam)`\n",
    "\n",
    "    Help on module vamb.parsebam in vamb:\n",
    "\n",
    "    NAME\n",
    "        vamb.parsebam - Estimate RPKM (depths) from BAM files of reads mapped to contigs.\n",
    "\n",
    "    DESCRIPTION\n",
    "        Usage:\n",
    "        >>> bampaths = ['/path/to/bam1.bam', '/path/to/bam2.bam', '/path/to/bam3.bam']\n",
    "        >>> rpkms = read_bamfiles(bampaths)\n",
    "\n",
    "    FUNCTIONS\n",
    "    \n",
    "    [ lines elided ]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "['/Users/jakobnissen/Downloads/example/bam/0.subset.bam',\n",
       " '/Users/jakobnissen/Downloads/example/bam/2.subset.bam',\n",
       " '/Users/jakobnissen/Downloads/example/bam/21.subset.bam',\n",
       " '/Users/jakobnissen/Downloads/example/bam/22.subset.bam',\n",
       " '/Users/jakobnissen/Downloads/example/bam/24.subset.bam',\n",
       " '/Users/jakobnissen/Downloads/example/bam/25.subset.bam',\n",
       " '/Users/jakobnissen/Downloads/example/bam/3.subset.bam',\n",
       " '/Users/jakobnissen/Downloads/example/bam/5.subset.bam',\n",
       " '/Users/jakobnissen/Downloads/example/bam/6.subset.bam']"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Let's do it:\n",
    "\n",
    "bamfiles = !ls /Users/jakobnissen/Downloads/example/bam\n",
    "bamfiles = ['/Users/jakobnissen/Downloads/example/bam/' + p for p in bamfiles]\n",
    "bamfiles"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Type of rpkms: <class 'numpy.ndarray'> of dtype float32\n",
      "Shape of rpkms (57762, 9)\n"
     ]
    }
   ],
   "source": [
    "# Yep, those file paths look right. This step takes some time for large files.\n",
    "rpkms = vamb.parsebam.read_bamfiles(bamfiles) \n",
    "print('Type of rpkms:', type(rpkms), 'of dtype', rpkms.dtype)\n",
    "print('Shape of rpkms', rpkms.shape)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "---\n",
    "The idea here is that two contigs from the same genome will always be physically present together, and so they should have a similar abundance in all samples. Some contigs represent repeats like duplicated segments - these contigs should have a fixed ratio of abundance to other contigs. Thus, even when considering repeated contigs, there should be a tight cosine distance correlation between abundances of contigs from the same genome.\n",
    "\n",
    "The `vamb.parsebam` module takes a rather crude approach to estimating abundance, namely by simply counting the number of mapped reads to each contig, normalized by total number of reads and the contig's length. This measure is in trancriptomics often called RPKM, *reads per kilobase per million mapped reads*. Other metagenomic binners like Metabat and Canopy uses an average of per-nucleotide depth of coverage instead. We do not believe there is any theoretical or practical advantage of using depth over RPKM. We will use the terms *abundance*,  *depth* and *RPKM* interchangably.\n",
    "\n",
    "---\n",
    "`>>> help(vamb.parsebam.read_bamfiles)`\n",
    "\n",
    "    Help on function read_bamfiles in module vamb.parsebam:\n",
    "\n",
    "    read_bamfiles(paths, dumpdirectory=None, refhash=None, minscore=None, minlength=None, minid=None, subprocesses=8, logfile=None)\n",
    "        Spawns processes to parse BAM files and get contig rpkms.\n",
    "\n",
    "        Input:\n",
    "            path: List or tuple of paths to BAM files\n",
    "            dumpdirectory: [None] Dir to create and dump per-sample depths NPZ files to\n",
    "            refhash: [None]: Check all BAM references md5-hash to this (None = no check)\n",
    "            minscore [None]: Minimum alignment score (AS field) to consider\n",
    "            minlength [None]: Ignore any references shorter than N bases\n",
    "            minid [None]: Discard any reads with nucleotide identity less than this\n",
    "            subprocesses [8]: Number of subprocesses to spawn\n",
    "            logfile: [None] File to print progress to\n",
    "\n",
    "        Output: A (n_contigs x n_samples) Numpy array with RPKM\n",
    "\n",
    "We can see (in the default value for the `subprocesses` argument to `read_bamfiles`) that the default number of parallel BAM-reading processes it will spawn is 8. This is because Python detected 8 threads on my laptop. In general, Vamb's default here is to use the number of availble threads, or 8 threads if more than 8 is detected, as the BAM-reading function will almost certainly become IO bound at more than 8 threads.\n",
    "\n",
    "As with the `vamb.parsecontigs.read_contigs` function, I don't care about the `minlength` argument, since our fasta file is already filtered. Again, I will re-iterate that filtering the FASTA file _before_ mapping leads to the best results.\n",
    "\n",
    "The function ignores all alignments with alignment score less than `minscore` (as determined by the auxiliary `AS:i` field in the BAM file, which Vamb assumes is present if `minscore` is not `None`). Ideally, the user should construct the BAM files so that they only contain alignments that the user believes are true (i.e. set reasonable alignment and filtering criteria).\n",
    "\n",
    "Similarly to `minscore`, this function has a `minid` option. The parser will ignore any alignments where the nucleotide identity fraction (i.e. in [0;1)) is smaller than `minid` across the entire alignment.\n",
    "\n",
    "The argument `refhash` is used to verify that the headers of the FASTA file given to VAMB are identical and in the same order as that of the BAM files. If this is `None`, no check is done. Else, this should be a length 16 `bytes` object, representing the md5 hash of the reference sequences in the FASTA file as calculated by `vamb.parsebam._hash_refnames`. If the hash of the references in any BAM file does not match his hash, Vamb raises an exception. Setting the flag `--norefcheck` when using Vamb on command-line sets `refhash` to `None` .\n",
    "\n",
    "Lastly, the argument `logfile` should be `None` or the filehandle of an opened, writeable file. If the latter, it will print status updates to the logfile.\n",
    "\n",
    "---\n",
    "Now, I tend to be a bit ~~paranoid~~<sup>careful</sup>, so if I loaded in 500 GB of BAM files, I'd want to save the work I have now in case something goes wrong - and we're about to fire up the VAE so lots of things can go wrong.\n",
    "\n",
    "What importants objects do I have in memory right now?\n",
    "\n",
    "* `tnfs`: A Numpy array of tnfs\n",
    "* `contignames`: A list of contignames\n",
    "* `lengths`: A Numpy array of contig lengths\n",
    "* `rpkms`: A Numpy array of rpkms\n",
    "\n",
    "I'm going to use `vamb.vambtools.write_npz` to save the Numpy arrays (that function is just a thin convenience wrapper for `numpy.savez_compressed`)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "\n",
    "with open('/Users/jakobnissen/Downloads/example/contignames.npz', 'wb') as file:\n",
    "    vamb.vambtools.write_npz(file, np.array(contignames))\n",
    "\n",
    "with open('/Users/jakobnissen/Downloads/example/lengths.npz', 'wb') as file:\n",
    "    vamb.vambtools.write_npz(file, lengths)\n",
    "\n",
    "with open('/Users/jakobnissen/Downloads/example/tnfs.npz', 'wb') as file:\n",
    "    vamb.vambtools.write_npz(file, tnfs)\n",
    "    \n",
    "with open('/Users/jakobnissen/Downloads/example/rpkms.npz', 'wb') as file:\n",
    "    vamb.vambtools.write_npz(file, rpkms)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<a id=\"encode\"></a>\n",
    "## Train the autoencoder and encode input data\n",
    "\n",
    "Again, you can use `help` to see how to use the module\n",
    "\n",
    "`>>> help(vamb.encode)`\n",
    "\n",
    "    Help on module vamb.encode in vamb:\n",
    "\n",
    "    NAME\n",
    "        vamb.encode - Encode a depths matrix and a tnf matrix to latent representation.\n",
    "\n",
    "    DESCRIPTION\n",
    "        Creates a variational autoencoder in PyTorch and tries to represent the depths\n",
    "        and tnf in the latent space under gaussian noise.\n",
    "\n",
    "        Usage:\n",
    "        >>> vae = VAE(nsamples=6)\n",
    "        >>> dataloader, mask = make_dataloader(depths, tnf)\n",
    "        >>> vae.trainmodel(dataloader)\n",
    "        >>> latent = vae.encode(dataloader) # Encode to latent representation\n",
    "        >>> latent.shape\n",
    "        (183882, 32)\n",
    "        \n",
    "    [ lines elided ]\n",
    "    \n",
    "---\n",
    "Aha, so we need to create the VAE, create the dataloader (and the mask), then use the `trainmodel` method first, then the `VAE.encode` method. You can call the `help` functions on those, but I'm not showing that here.\n",
    "\n",
    "Training networks always take some time. If you have a GPU and CUDA installed, you can pass `cuda=True` to the VAE to train on your GPU for increased speed. With a beefy GPU, this can make quite a difference. I run this on my laptop (with a puny Intel GPU), so I'll just use my CPU. And I'll run just 10 epochs rather than the more suitable 500:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\tNetwork properties:\n",
      "\tCUDA: False\n",
      "\tAlpha: 0.15\n",
      "\tBeta: 200\n",
      "\tDropout: 0.2\n",
      "\tN hidden: 512, 512\n",
      "\tN latent: 32\n",
      "\n",
      "\tTraining properties:\n",
      "\tN epochs: 10\n",
      "\tStarting batch size: 256\n",
      "\tBatchsteps: None\n",
      "\tLearning rate: 0.001\n",
      "\tN sequences: 24395\n",
      "\tN samples: 9\n",
      "\n",
      "\tEpoch: 1\tLoss: 0.507493\tCE: 1.0706676\tSSE: 55.385077\tKLD: 80.9332\tBatchsize: 256\n",
      "\tEpoch: 2\tLoss: 0.251523\tCE: 0.4548252\tSSE: 38.496042\tKLD: 124.8668\tBatchsize: 256\n",
      "\tEpoch: 3\tLoss: 0.223408\tCE: 0.4064800\tSSE: 33.100297\tKLD: 114.9226\tBatchsize: 256\n",
      "\tEpoch: 4\tLoss: 0.208888\tCE: 0.3823074\tSSE: 30.490180\tKLD: 106.1688\tBatchsize: 256\n",
      "\tEpoch: 5\tLoss: 0.201011\tCE: 0.3697399\tSSE: 28.965108\tKLD: 101.0825\tBatchsize: 256\n",
      "\tEpoch: 6\tLoss: 0.195216\tCE: 0.3593385\tSSE: 28.127550\tKLD: 97.5539\tBatchsize: 256\n",
      "\tEpoch: 7\tLoss: 0.191864\tCE: 0.3538129\tSSE: 27.525870\tKLD: 95.3914\tBatchsize: 256\n",
      "\tEpoch: 8\tLoss: 0.188392\tCE: 0.3483603\tSSE: 26.816342\tKLD: 93.2792\tBatchsize: 256\n",
      "\tEpoch: 9\tLoss: 0.184815\tCE: 0.3411324\tSSE: 26.516094\tKLD: 91.0830\tBatchsize: 256\n",
      "\tEpoch: 10\tLoss: 0.182677\tCE: 0.3379037\tSSE: 26.099772\tKLD: 89.2728\tBatchsize: 256\n"
     ]
    }
   ],
   "source": [
    "import sys\n",
    "\n",
    "vae = vamb.encode.VAE(nsamples=rpkms.shape[1])\n",
    "dataloader, mask = vamb.encode.make_dataloader(rpkms, tnfs)\n",
    "\n",
    "with open('/tmp/model.pt', 'wb') as modelfile:\n",
    "    vae.trainmodel(dataloader, nepochs=10, modelfile=modelfile, batchsteps=None, logfile=sys.stdout)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "First we create the VAE, then we create the dataloader and the mask. The dataloader normalizes the TNF such that the mean and standard deviation for each tetranucleotide across all contigs (i.e. a column) is 0 and 1, respectively, and normalizes the rpkm such that each contig (i.e. a row) sums to 1. Furthermore, the dataloader shuffles the contigs at each epoch.\n",
    "\n",
    "The dataloader also discards all contigs where either the TNF vector or the depths vector is all zeros - we call these contigs *zero contigs*. A zero contig implies that it is either shorter than 4 basepairs, or that no reads mapped to it from any sample. The `mask` it returns is a boolean vector with `True` if a contig was kept, and `False` if it was discarded. We began with 50658 contigs, and the log above states that it was trained with 57762 contigs, implying that 57762 - 24395 = 33367 contigs were discarded. Normally, almost no contigs are discarded, but this test data contains very few reads.\n",
    "\n",
    "Here, we kept the default value `False` to the `destroy` keyword of `make_dataloader`. If this is set to `True`, the input arrays are normalized and masked in-place, modifying them. This prevents another in-memory copy of the data, which can be critical for large array sizes.\n",
    "\n",
    "I passed a few keywords to the `trainmodel` function: \n",
    "* `nepochs` should be obvious - the number of epochs trained\n",
    "* `modelfile` will be discussed in a bit\n",
    "* `logfile` is the file to which the progress text will be printed. Here, I passed `sys.stdout` in order to print the progress to this notebook\n",
    "* `batchsteps` should be `None`, or an iterable of integers. If `None`, it does nothing. Else, the batch size will double when it reaches an epoch in `batchsteps`. Increasing batch size helps find better minima; see the paper \"Don't Decay the Learning Rate, Increase the Batch Size\".\n",
    "\n",
    "The VAE encodes the high-dimensional (n_samples + 103 features) input data in a lower dimensional space (nlatent features). When training, it learns an encoding scheme, with which it encodes the input data to a series of normal distributions, and a decoding scheme, in which it uses one value sampled from each normal distribution to reconstruct the input data.\n",
    "\n",
    "The theory here is that if the VAE learns to reconstruct the input, the distributions must be a more efficient encoding of the input data, since the same information is contained in fewer neurons. If the input data for the contigs indeed do fall into bins, an efficient encoding would be to simply encode the bin they belong to, then use the \"bin identity\" to reconstruct the data. We force it to encode to *distributions* rather than single values because this makes it more robust - it will not as easily overfit to interpret slightly different values as being very distinct if there is an intrinsic noise in each encoding.\n",
    "\n",
    "### The loss function\n",
    "\n",
    "The loss of the VAE consists of three major terms:\n",
    "\n",
    "* Cross entropy (CE) measures the dissimilarity of the reconstructed abundances to observed abundances. This penalizes a failure to reconstruct the abundances accurately.\n",
    "* Sum of squared error (SSE) measures the dissimilary of reconstructed versus observed TNF. This penalizes failure to reconstruct TNF accurately.\n",
    "* Kullback-Leibler divergence (KLD) measures the dissimilarity between the encoded distributions and the standard gaussian distribution N(0, 1). This penalizes learning.\n",
    "\n",
    "All three terms are important. CE and SSE is necessary, because we believe the VAE can only learn to effectively reconstruct the input if it learns to encode the signal from the input into the latent layers. In other words, these terms incentivize the network to learn something. KLD is necessary because we care that the encoding is *efficient*, viz. it is contained in as little information as possible. The entire point of encoding is to encode a majority of the signal while shedding the noise, and this is only achieved if we place contrains on how much the network is allowed to learn. Without KLD, the network can theoretically learn an infinitely complex encoding, and the network will learn to encode both noise and signal.\n",
    "\n",
    "In normal autoencoders, people use binary cross-entropy rather than crossentropy. We believe crossentropy is more correct here, since we normalize by letting the depths sum to one across a contig. In this way, you can view the depths distribution across samples as a probability distribution that a random mapping read will come from each sample.\n",
    "\n",
    "In `encode.py`, the loss function is written as:\n",
    "\n",
    "$L = \\frac{(1 - \\alpha)}{ln(S)} \\cdot CE + \\frac{\\alpha}{103} \\cdot SSE + \\frac{1}{N_{L}\\beta} \\cdot KLD$\n",
    "\n",
    "where $N_{L}$ is number of latent neurons and S is number of samples. It is hardly obvious where the scaling factors come from, so let me try to explain it.\n",
    "\n",
    "As the learning rate is fixed and optimized for a specific gradient, this means the total reconstruction loss $\\frac{(1 - \\alpha)}{ln(S)} \\cdot CE + \\frac{\\alpha}{103} \\cdot SSE$ should sum to a constant, lest the training become ustable. To make things simpler, we want it to sum to 1. It would probably be more precise to say that L should be 1, but since $\\frac{1}{N_{L}\\beta} \\cdot KLD <<  \\frac{(1 - \\alpha)}{ln(S)} \\cdot CE + \\frac{\\alpha}{103} \\cdot SSE$ for any values of $\\alpha$ and $\\beta$ that seem to work, setting the reconstruction loss to 1 is basically the same.\n",
    "\n",
    "When optimizing the network, we want a single variable to control the ratio between $SSE$ and $CE$ - we call this $\\alpha$. This scales CE and SSE so that $\\alpha = \\frac{SSE}{SSE + CE}$ \n",
    "\n",
    "But here comes a problem. While we want to scale SSE and CE so that the two constrains above (namely $CE+SSE=1$ and  $\\alpha = \\frac{SSE}{SSE + CE}$) are true, we can't know *beforehand* what CE, KLD or SSE is. And, in any rate, these values changes across the training run (that's the point of training!).\n",
    "\n",
    "What we *can* reason about is the values of CE and SSE in a totally *naive*, network which had *no knowledge* of the input dataset. This represents the state of the network before *any* learning is done. What would such a network predict? Well, since we normalize the reconstructed depths across $S$ samples to be between 0 and 1 and sum to 1, the outputs would be close to $[S^{-1}, S^{-1} ... S^{-1}]^{T}$, which, by the definition of cross entropy, would yield a CE of $ln(S)$. We normalize both TNF inputs and outputs to follow an approximately normal distribution with mean around 0, which means the expected SSE for TNF is 1 per input TNF neuron, for a total of $SSE = 103$.\n",
    "\n",
    "Importantly, if we actually check the CE and SSE values for untrained networks, they are quite close to these expected values of $ln(S)$ and 103. Since neither CE nor SSE is reduced by more than an order of magnitude from their starting values during training on realistic datasets (usually much less), these expected values work as stand-ins for what we can expect CE and SSE to be across training.\n",
    "\n",
    "So the purpose of the scaling factors in front of the CE and SEE terms is to scale CE from the expected value of $ln(S)$ to the target value of (1-$\\alpha$) and SSE from 103 to $\\alpha$. Hence the scaling factors $\\frac{(1 - \\alpha)}{ln(S)}$ and $\\frac{\\alpha}{103}$.\n",
    "\n",
    "For KLD, we want the user to be able to control the ratio $\\frac{C+S}{K}$, where C, S and K are each of the terms from CE, SSE and KLD in the loss, respectively. Since KLD is defined as a sum of individual KLD for each of latent neurons, its value is proportional to $N_{L}$. So we let the user set a ratio\n",
    "\n",
    "$\\beta = \\frac{KLD}{N_{L}K} \\cdot \\frac{(1 - \\alpha)}{ln(S)} \\cdot CE + \\frac{\\alpha}{103} \\cdot SSE$,\n",
    "\n",
    "Where again, K is the weighed KLD in the loss. In other words, we allow the user to weigh the KLD-related loss relative to the reconstruction loss. However, as we just scaled the CE and SSE terms to make sure they sum to one, this simplifies to:\n",
    "\n",
    "$\\beta = \\frac{KLD}{N_{L}K} \\Leftrightarrow K = \\frac{1}{N_{L}\\beta} \\cdot KLD$.\n",
    "\n",
    "Hence the scaling factor $\\frac{1}{N_{L}\\beta}$ in front of KLD.\n",
    "\n",
    "If you look at the outputs from the 10 epochs, you can see the KL-divergence rises the first epoch as it learns the dataset and the latent layer drifts away from its prior. At epoch 2, the penalty associated with KL-divergence outweighs the CE and SSE losses. At this point, the KL will stall, and then fall. This point depends on $\\beta$ and the complexity of the dataset.\n",
    "\n",
    "Okay, so now we have the trained `vae` and gotten the `dataloader`. Let's feed the dataloader to the VAE in order to get the latent representation.\n",
    "\n",
    "---"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "(24395, 32)\n"
     ]
    }
   ],
   "source": [
    "# No need to pass gpu=True to the encode function to encode on GPU\n",
    "# If you trained the VAE on GPU, it already resides there\n",
    "latent = vae.encode(dataloader)\n",
    "\n",
    "print(latent.shape)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "---\n",
    "That's 24395 contigs each represented by the mean of their latent distribution.\n",
    "\n",
    "Sometimes, you'll want to reuse a VAE you have already trained. For this, I've added the `VAE.save` method of the VAE class, as well as a `VAE.load` method. You will have noticed in the training example above that I defined a `modelfile`, a file the VAE will create and save its parameters to. We can always use that file to recreate the VAE and have a pretrained model. But remember - a trained VAE only works on the dataset it's been trained on, and not necessarily on any other!\n",
    "\n",
    "I want to **show** that we get the exact same network back that we trained, so here I encode the first contig, delete the VAE, reload the VAE and encode the first contig again. The two encodings should be identical.\n",
    "\n",
    "---"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "tensor([-10.1629,   6.6441,  -7.9587,  10.9046,   6.9817,  -2.1498,  -0.8452,\n",
      "         -8.6588,  17.6399,   3.3330,   8.3227,  -1.2341,  -5.0678,  -0.2922,\n",
      "          1.4167,  10.3114,   6.6064,   6.4869,   5.6234,   0.5074,  10.7002,\n",
      "          2.8435,   1.6936,   0.5806,  -6.0573,  -7.0818,  10.3524,   4.6124,\n",
      "         -7.7825,  -2.2355,  -8.7436,   2.7398], grad_fn=<SelectBackward>)\n",
      "tensor([-10.1629,   6.6441,  -7.9587,  10.9046,   6.9817,  -2.1498,  -0.8452,\n",
      "         -8.6588,  17.6399,   3.3330,   8.3227,  -1.2341,  -5.0678,  -0.2922,\n",
      "          1.4167,  10.3114,   6.6064,   6.4869,   5.6234,   0.5074,  10.7002,\n",
      "          2.8435,   1.6936,   0.5806,  -6.0573,  -7.0818,  10.3524,   4.6124,\n",
      "         -7.7825,  -2.2355,  -8.7436,   2.7398], grad_fn=<SelectBackward>)\n"
     ]
    }
   ],
   "source": [
    "import torch\n",
    "\n",
    "# Manually create the first mini-batch without shuffling\n",
    "rpkms_in = torch.Tensor(rpkms[:128]).reshape((128, -1))\n",
    "tnfs_in = torch.Tensor(tnfs[:128]).reshape((128, -1))\n",
    "\n",
    "# Put VAE in testing mode - strictly not necessary here, since it goes in test\n",
    "# mode when encoding latent, as we did above\n",
    "vae.eval()\n",
    "\n",
    "# Calling the VAE as a function encodes and decodes the arguments,\n",
    "# returning the outputs and the two distribution layers\n",
    "depths_out, tnf_out, mu, logsigma = vae(rpkms_in, tnfs_in)\n",
    "\n",
    "# The mu layer is the encoding itself\n",
    "print(mu[0])\n",
    "\n",
    "# Now, delete the VAE\n",
    "del vae\n",
    "\n",
    "# And reload it from the model file we created:\n",
    "# Here we can specify whether to put it on GPU and whether it should start\n",
    "# in training or evaluation (encoding) mode. By default, it's not on GPU and \n",
    "# in testing mode\n",
    "vae = vamb.encode.VAE.load('/tmp/model.pt')\n",
    "depths_out, tnf_out, mu, logsigma = vae(rpkms_in, tnfs_in)\n",
    "print(mu[0])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "---\n",
    "We get the same values back, meaning the saved network is the same as the loaded network!\n",
    "\n",
    "<a id=\"cluster\"></a>\n",
    "## Binning the encoding\n",
    "\n",
    "__The role of clustering in Vamb__\n",
    "\n",
    "Fundamentally, the process of binning is just clustering sequences based on some of their properties, with the expectation that sequences from the same organism cluster together.\n",
    "\n",
    "In the previous step, we encoded each sequence to a vector of real numbers. The idea behind encoding is that, because the neural network attempts to strike a balance between having low reconstruction error (i.e. high fidelity) and having low Kullback-Leibler divergence (i.e. containing as little information as possible), then the network will preferentially encode signal over noise. In this way, the VAE act like a denoiser. This has been explored in other papers by people before us. Furthermore, because the latent space has fewer dimensions than the input space, clustering is quicker and more memory efficient.\n",
    "\n",
    "With the latent representation conveniently represented by an (n_contigs x n_features) matrix, you could use any clustering algorithm to cluster them (such as the ones in the Python package `sklearn.cluster`). In practice though, you have perhaps a million contigs and prior constrains on the diameter, shape and size of the clusters, so non-custom clustering algorithms will probably be slow and inaccurate.\n",
    "\n",
    "The module `vamb.cluster` implements a iterative medoid clustering algorithm. The algorithm is complicated and not very elegant, so I will go through it here in painstaking detail.\n",
    "\n",
    "### Overview of Vamb clustering algorithm\n",
    "\n",
    "In very broad strokes, the algorithm works like this:\n",
    "\n",
    "    A) Begin with an arbitrary sequence P, and move from P to a center of a nearby cluster, called the medoid M.\n",
    "    B) Find the distance d from M where the density of other points drops according to certain criteria.\n",
    "    C) If no d can be found, restart from A). If it has restarted too often the last X tries, relax criteria in B). If criteria has been maximally relaxed, choose an arbitrary value of d.\n",
    "    D) Output all points within d of M, and remove them from dataset.\n",
    "    E) Repeat from A) until no points remain in the dataset.\n",
    "\n",
    "The idea behind this algorithm is that we expect the clusters to be roughly spherical, have a high density near the middle, and be separed form other clusters by an area with lower density. Hence, we should be able to move from P to M in step A) by moving towards higher density. And we should be able to find the radius of the cluster from M by considering a distance from M where the density of points drops.\n",
    "\n",
    "This algorithm consists of several steps, discussed below. Most of the steps unfortunately have several parameters which have been chosen by a combination of guessing, gut feeling and testing.\n",
    "\n",
    "As a distance measure, we are using cosine distance. This is particularly useful for a couple of reasons:\n",
    "\n",
    "First, it is extremely quick to calculate. Given two points represented by vectors $\\textbf{a}$ and $\\textbf{b}$, we define cosine distance as:\n",
    "\n",
    "\\begin{equation*}\n",
    "\\frac{1}{2} - \\frac{\\textbf{a} \\cdot \\textbf{b}}{2|\\textbf{a}| |\\textbf{b}|}\n",
    "\\end{equation*}\n",
    "\n",
    "This is slightly different from the ordinary definition, because it is squeezed between 0 and 1, rather than -1 and 1.\n",
    "\n",
    "Now if we preprocess all points by calculating:\n",
    "\n",
    "\\begin{equation*}\n",
    "\\bar{\\textbf{a}} = \\textbf{a} \\circ \\frac{1}{\\sqrt{2} \\textbf{|a|}}\n",
    "\\end{equation*}\n",
    "\n",
    "Where $\\circ$ is elementwise multiplication, then the cosine distance is simply:\n",
    "\n",
    "\\begin{equation*}\n",
    "\\frac{1}{2} - \\bar{\\textbf{a}} \\cdot \\bar{\\textbf{b}}\n",
    "\\end{equation*}\n",
    "\n",
    "And since the points are represented by a matrix $\\bar{\\textbf{M}}$, with each row vector being a point, finding the distance to from the i'th point of $\\bar{\\textbf{M}}$ to all other points is then a series of vector products as in the equation above, which can be expressed in a single matrix multiply and subtraction:\n",
    "\n",
    "\\begin{equation*}\n",
    "\\frac{1}{2} - \\bar{\\textbf{M}} \\times \\bar{\\textbf{M}}_{i}^{T}\n",
    "\\end{equation*}\n",
    "\n",
    "Second, with all the $N_{latent}$ dimensions being used by the neural network the surface of an $N_{latent}$-dimensional hyperphere is so large that it's not likely that multiple clusters are far in euclidian space but close in cosine space.\n",
    "\n",
    "And third, it works better than the other alternatives we tried in practice, although I don't understand why for instance Euclidian distance is not better.\n",
    "\n",
    "### Finding the medoid\n",
    "\n",
    "First, we shuffle the points pseudorandomly to avoid any bias introduced by the order of the sequences. Then, we use the following algorithm to find the medoid. The idea is that, as you move nearer to a cluster's centre, the mean distance to close points will decrease.\n",
    "\n",
    "Currently, the default parameters DC and MAX_STEPS are 0.05 and 25, respectively.\n",
    "\n",
    "Function `wander_medoid`\n",
    "    1. Pick next point S as the seed. If last point has been reached, begin from top again.\n",
    "    2. Let C be the set of contigs within a small distance DC of S\n",
    "    3. Calculate mean distance of points in C to S, ignoring the distance of S to itself. Let that be mean_S.\n",
    "    4. Sample a point P in C. While you have not yet futilely sampled MAX_STEPS times or exhaused all points in C:\n",
    "        Calculate mean distance of P to all points within DC of P, let that be mean_P\n",
    "        If mean_P < mean_S, set S to P and go to point 2.\n",
    "    7. Return S as medoid."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Finding the right distance\n",
    "\n",
    "Having found a medoid above, we need to find out if it's the center of a suitable cluster, and if so, what the radius of that cluster is.\n",
    "\n",
    "In order to see how this is done, it is useful to look at these graphs first:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Text(0.10000000000000002, 97.86, 'Clustering threshold')"
      ]
     },
     "execution_count": 13,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAXkAAAEfCAYAAACkrrZ6AAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4yLjEsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy+j8jraAAAgAElEQVR4nO3deXwTdf7H8denLbQcQguFUm6wQCkoeF+Asq6C4IXKiouggiteKOvxE68FLxYXj9WFFREVBLxQlFXBcxfllEMEahEoUClHS0uhWFqgtN/fHzMpIfRKm2TS9PN8PPJoMpnMvDNJPv3mO9/MiDEGpZRSoSnM6QBKKaX8R4u8UkqFMC3ySikVwrTIK6VUCNMir5RSIUyLvFJKhbCQKvIicquImDIuf3S7v73TWatKRNqISJGIHBWR2Goua4aI7Czjvj/a2+oSj/nTvFzHeBH5Q3Vy1gYiMkJEttiv6wEfLC9NRGa43a7We99+7Hi32+NFxKvx1yIyRkSuq8r6VdWFVJF3Mxi4wOOyEvjCvr7HuWjVNgzrdasD3BTgdT8DDPLyMeMALfLlEJGWwDRgGda2+qOziSplOtZnyRtjAC3yARbhdAA/+dkYk1rGfVkBTQKISKQx5oiPFncLkAw0sq//y0fLrZAxZmug1uULPt7u/tQJCAdmGmOWOB2mMowxO4FSvwWq4BKqLflSlfaVVUTqi8hrIrJPRPJE5BMRudCe71a3+RaJyKJSllnW1+I+IjLX/ur9o31fhIg8KiK/isgREdktIi+KSFQl858PdAbeAWYBZ4lIt6psi6rw7K6xn88zIrJVRA6LSLaILBGRXvb9rq/zj7t1m413e/zNIrLO7bGzRCTeY52VfX1miMhOEblARJaJSAHwD/u+ISLyXxHJspexVkRuKeX5GRF5VkQeFJHfRCRfRL4Qkeb25UMRyRWRdBF5pJLbrIud+YCIFIjIChHp754bWGTf/M7OMKOc5V0uIgtEZI+dL9nOG16ZPJXIG25vA9fyF5X2Hiutu0ZE7heRjfbz3C8iq0VkkH1fGtAOGOr2Xphh35dgv/bb7cdus1/zGI/lu17jM0RksZ1vi4jcWUq+DvYyM+zP2jYRecVjnotF5DsR+V1EDonIVyLS3WOefvb7Kdd+72wSkb9Vbes6I1Rb8uEi4v7cjDGmqIx5p2F174wHVgOXAnN8kGEO8B5wA8e382zgKuB5rK/mXbG6QNoD11dimbcARfayGwKPA8OBEwqO/YFKM8ZcUpmgHtvKpTINgEeAv9o5fsb6dnE20MS+/wJgOTADeN2ettNe5x32tA+AR4GWwATgPBE50xiTZ8/vzevTGHgfeAF4DCiwp3cEPgImAsVAH2C6iNQzxkz1WMYwrG9KdwNxwD+x/qmeAix0yzNRRDYYYxaUtXHE6oZZAvwO3AvkAvcAX4jIlcaYhViv/xrgVfu+nyj/22ZH4Dusb3CHsbb3eKAZMLacx1XWeKxt9xLwtb38/1T0IBEZCrwIPA0sBuoBp3P8vTAIWACss9cBx59nSyAdqztnP9ZzfMye37NLqBHwLtbr8jRwG/CaiGwyxvzPztIBq3s2H/gbsAVoC1zulncgMB+rC/dme/IjwGIROd0Yky4iHe3n/pG9rqNY37o6VrQ9gooxJmQuwK2AKeWyxOP+9vbtLlgf+v/zWM6r9ny3uk1bBCwqZZ1pwIxSMrzsMV9ve/pwj+lD7ek9K3hukUAO8JXbtOXALiDcY95U4LtKbK8ZZWwv98slHvOnud3+HJhXwToM8KzHtHAgE/ifx/Re9vz3VeH1cT2XayrIE4b1T/cNYF0pWTcDEW7TXrKnP+E2LQLYC7xdwbpeAI4BCR7PfRPwk9u0P3pu60q+38XO8jhWcQyrxPuyfTnLiwHygKke0x+xHzvebdp4rMaT6/Zk9+dUxvLTgNmVeF4Rbu+FM0p5jft6fC72AdPcpr1jP4+W5azjpM8I1j+QbOCf9u0b7PU18uZ1CbZLqHbXDALOcbuMLGO+87A+KHM9pn/kgwyfeNzuj9US+Eisbo4IuwX9tX1/nwqWdzXWh/Adt2kzsVpBJ+yoM8YkGGMurWTOvZy4rVyXeyrx2FXAABF5TkR6iUjdSq6zC9Acjxa5sfqjfwMutid5+/oUYv3jOYGIdBKR90Rklz1PIXC7ncPTN8aYY263f7X/fuWW8xhWkWhTRg6XPsAK47Z/yFjfKN8DeopIowoefxIRiReR10XkN6z3UyHwLBCNtU2r4zSgAfChx/T3K/HYVVjP6V9ijcyqX9mVikhdEXlMrG7MAqzntNi+2/M1yjd2ix3AWPtcNmO11F0uBz43xuwuY32dgFOBOR6fxXyshpPrs/izneV9EblBRKq7fR0Rqt01yabsHa/uXP2/ez2mZ/ogg+cInuZAXeBQGfM3rWB5t2C9Cf8nItH2tK+w3oTDcStCXio0xqz2nOi2jvJMwOoyuBnr63WeiHwEPGyMyS7nca6v8KWNcspwu9/b1yfLeHTLiUhD4BusbTcW2IpVHO8CRpSyjP0et4+WM72ifSlNgLWlTM/A+ucVAxysYBklRCQMq/ugJVZL+lesLqlrsVrzldq3Uw7X9vbcvpX5PLxjr38kVldXoYgsAB4wxqRV8Ni/A6OxukSWYXVvtQbmcfJz8nwdAI54zNeU8ncKu4r1m/bF0w4AY0yqiPTD+iYzC4gUkZXAI8aY78t7QsEkVIt8ZbmKTHNgu9v0uFLmPYz1dc5Tk1KmgfU1z90+exm9y5i/1FYHgIjEAf2wXq9dpcwySEQaGWMqXTB8wRhTiLV/4XkRaQFcidW9UR+4sZyH5th/W5RyXwusPmrw7vWBk7c5WH267YDexm3kShn7IXwth7Kfo6H0glWeU7H6yIcZY2a7JorIVVVOeCLX9o4DfnGbXtb2LmGs/o3XgdftHaaXY/XRf4D1jaw8Q4B3jDHPuibY/5yrKhtoVc79++y/jwLflnK/6x879reG/4lIJHAR1j+iL0SkfQUNmaARqt01lbUS68M22GO6522wuhE6u3dJiEgfrB1ylfElVmujsTFmdSmXMos8Vr99BFbrs6/HZQzWTq7SMgeMMSbDGDMd60PjPkLhKFY+d5uwWodD3CeKyIVYBXmRPcmb16csrm6DQrf1xADXeLGMqvoeOF9OHM0VjvUPcG0V/imX9lzqYL0/fGE91jfNP3lMH1LKvGUyxuw3xnyA1e3j/l44wsnvBbCeV6HHtNu8WaeHr4ErxWOklptNWPsHupXxWVzv+QBjzBFjzH+xRmw1ADpUI19A1eqWvDHmVxF5F3jG/iq8BuvHKK6WUbHb7O8DdwBv2UO/OgAPYI2YqMy6FonIe1h98i9hFbBirJE1A7C+Am4u4+G3YLVkX7dbTCVEZDHwf1hdNm/a01KB37zol68SEZmPNVriJ6xW6RlY+x5ed5stBRgoIl/a8+w2xuy2h6G9LiKzsUYdtQKewxoJ8RZ4/fqUZRlWl8gUERmH9QF9Aqu117iqz72SXsba4fmNve6DWF0ZnYGBVVjeRqzGxnMiUoRVGP/qm6hgjDkgIi9jDXn9HatYlrdPq4SITMPqZlmO1b3WGWuk0tdus6UAvUXkSqwuq2y7K+dL4BYR2YC1r+M64MJqPJVxWJ+pZSIywV5mK6C/MeZmY4wRkXuA+Xaj7UOs90Ocvd4dxpiX7KGZfbBG+aQDsVit/91YI7BqBqf3/PrywvERBAkV3N/ebVp94DWsr9Z5WH2eAyllpAYwCqsIFWAVj7MoexTDSRmwvjndj1UYD2P9g1iH1TpoXEbmnvbynizneT+HVfA62LfTKGUkUCmPmwHsLOO+k0Z8cPLomgeBFVhffwuwWkjjgTpu81yEVZwPc/IIjZvt53/EXsYsIN4jR6Venwqeyx+w+sYLsPrk78NjdIg9X2kjgUp9PbG+bSypxDbuAnxqv9aH7e3Vv6JtXc7yemINy8zH6nd+Gmsnsuf7uqz3ZfsKlh+OtSM3w95ei4CkUl67E7YfVkNkEVaBP4LVKHkZt5EpQCLWDtV8e3kz7OmxWI2o/fZlDtY/l9JGUJ30GlPKyDesrq33sIr3Yft1f8ljnguwdtTvt+dJs3Nc4Hb/fKwCfwSrO2su0KWi1ymYLmI/GeVGRB7CKrztjTE7nM6jTqSvj1KVV6u7awDsr47dsYZLFWPtGH0I+FALiPP09VGqemp9kcfqR7wWa3hdA6zRK69i9esp5+nro1Q1aHeNUkqFsNo+hFIppUJaQLprYmNjTfv27QOxKqWUChlr1qzJNsY0q84yAlLk27dvz+rVJ/1yXimlVDnsYxRVi3bXKKVUCNMir5RSIUyLvFJKhTAt8kopFcK0yCulVAjTIq+UUiFMi7xSSoUwLfJKKRXCtMgrVQVvvvkmIkJ+fr7TUZQqlxZ5pby0b98+Xn75ZQC+//57iosrc4IqpZyhRV4pL2zdupXY2Fh++cU6z/WAAQO45ZZb2LVrF0OHDmXv3r0OJ1TqRHo8eaW8kJaWdtK02bNnExMTw7vvvkvbtm35+9//Tl5eHlFRUURE6EdMOUtb8kp54dixYyXXGzRoQPPmzQGYPHkyYWFhTJ06lfT0dE455RTuuusup2IqVUKLvFJe2L9/f8n1G2+8kcmTJwNgjGHKlCkcPnyYtm3bAjB9+nRHMirlTou8Ul44cOBAyfXmzZuTlJRUcvv2229n2rRpJbcbNWqEnnlNOU2LvFJecG/Jx8XF0alTJwAiIyOJiIhg2LBhbNy4kYcffpiDBw+Snp7uVFSlAC3ySlVKXl4eOTk5JxX5unXr8sknn5SMtgFITEzk2muvBWD9+vUBz6qUOy3ySlXCH//4R5o2bXpCkXftdL322ms59dRTT5i/W7duACcUf6WcoEVeqUr48ccfAevHTwAdO3ake/fuZc7fuHFjmjdvTmpqakDyKVUWLfJKVaCoqKjk+pYtW7jooovYunUrcXFx5T4uISFBi7xynBZ5pcqxe/durr766hOmRUdHV+qxnTp1YsuWLf6IpVSlaZFXqhSvvPIK//nPf5g9ezYLFiwAoF+/fgCVPlZNQkICu3bt0oOYKUdpkVeqFGPGjOGaa66hXr16JdNGjx4NwE8//VSpZSQkJADW8W6UcooWeaXK4TpWzbhx4+jXrx9RUVGMGzeuUo/t2LHjCctQygl69CSlPLgfn2b+/PlERUUxfvx4AAoKCiq9nFatWgGwa9cun+ZTyhvaklfKw8GDB0uub926lUaNGlVpOXFxcYSFhbFz505fRVPKa1rklfKQm5t7wu3GjRtXaTkRERHEx8drS145Sou8Uh7cD0IGVLklD1aXjRZ55SQt8kp58FVLHrTIK+dpkVfKg6vIu44Lry15VZNpkVfKg6u7xnXQseq25HNzc8nLy/NJNqW8pUVeKQ+ulrzrx0ynnHJKlZflGiuvhzdQTtEir5QHV5F3FWgRqfKyevbsCcDatWurH0ypKtAir5SHAwcOUL9+/ZIW/NGjR6u8rISEBBo0aMDPP//sq3hKeUWLvFIecnNziY6Opm7dugAUFhZWeVlhYWH06NFDW/LKMVrklfKQm5tL48aNS3a41q9fv1rL69mzJxs2bPBFNKW8pkVeKQ9ZWVk0bdqU6667jqeffppnn322Wstr3749ubm5JxwuQalA0SKvlIfdu3fTqlUrIiIiePLJJ6s1hBL0QGXKWVrklXJjjGH37t20bNnSZ8ts3bo1oEVeOUOLvFLAvn376NevH6tXr+bQoUN+KfJ6NErlBC3ySgHLly/n66+/5txzzwXwaZF3LUuLvHKCFnmlgMzMzBNuu/rRfSEqKorY2FjtrlGO0CKvFCe3sn3ZkgeryyY9Pd2ny1SqMrTIK4W1UzQuLq7kdnx8vE+Xf/rpp/Pf//6XX375xafLVaoiWuSVwiryrVq1YsKECXTt2pWGDRv6dPnPP/88devW5cUXX/TpcpWqiBZ5pbC6a1q1asWjjz5KSkqKz5ffokULEhMTtctGBVyE0wGUctK0adM4cOAAu3bt4sILL/TruuLj4/WQwyrgtMirWm3UqFEl113j2f0lPj6eH374wa/rUMqTdteoWq1JkyYAtGvXjltuucWv64qPjycnJ4cjR474dT1KudMir2q1wsJChg4dysaNGwPSkgfIyMjw63qUcqdFXtVahw8f5vfffycpKYl69er5fX2uIr97926/r0spFy3yqtbKysoCoFmzZgFZn+sHVnv27AnI+pQCLfKqFtu7dy8AzZs3D8j6XC15LfIqkLTIq1or0C352NjYE9arVCBokVe1VqCLfEREBDExMWRnZwdkfUqBFnlViwW6uwagadOm7Nu3L2DrU0qLvKq1srKyqFOnDo0aNQrYOrXIq0DTIq9qrd9++41WrVohIgFbZ2xsrHbXqIDSIq9qrdTUVDp16hTQdWpLXgWaFnlVK2VkZLBlyxYt8irkaZFXtc6SJUuIj48nNzc34EU+NjaWQ4cOcfjw4YCuV9VeWuRVreN+vHgnWvKAtuZVwGiRV7WO+y9OExISArpu1w+itMirQNEir2qdtLQ0ACZNmkTnzp0Dum5XS/7ss88mLy8voOtWtZMWeVWrDB8+nBkzZnDhhRfy0EMPBXT4JECPHj2IioqisLCQ1atXB3TdqnbSIq9qjaNHjzJr1iwA2rdv70iGmJgYUlNTAfjll18cyaBqFy3yqlZ49913GTRoUMltJ7tKWrZsSXR0NMnJyY5lULWHnuNV1QqzZs3iyy+/LLk9ePBgx7KICN26ddOWvAoILfKqVti4ceMJ1xMTEx1MA927d2fu3LmOZlC1g3bXqJB36NAhfvvtt5LbTvXHu2vVqhU5OTkUFhY6HUWFOC3yKuRt2rTphNtRUVEOJTmucePGAOTm5jqcRIU6LfIq5Ln/wjVYaJFXgaJ98irkrV+/nsjISD7++GPq1avndBwAoqOjAS3yyv+0yKuQt3r1anr06MHAgQOdjlLC1ZI/cOCAw0lUqNPuGhXSiouLWbNmDWeffbbTUU6g3TUqULTIq5CWmprKwYMHOeuss5yOcgIt8ipQtMirkLZhwwYAevbs6XCSE2mRV4GiRV6FtO3btwPQsWNHh5OcSPvkVaBokVchbfv27URHR5eMZgkWERERNGjQgPHjxzNy5Ein46gQpkVehbTt27fToUMHp2OUqm7dugC89dZbDidRoUyLvAppwVzk9+/f73QEVQtokVchyxhDWlpa0BZ5l0CfuETVLlrkVUgqKiqic+fOHD58OOh2unoKhmPpqNClRV6FpKVLl5Kamkrfvn258cYbnY5Tqs8++4yEhAQKCgrIz893Oo4KUZUu8iKih0BQNYIxhtmzZxMVFcV//vOfkpNnB5srr7ySRx55BIDs7GyH06hQ5U1Lfo+IvCAiXf2WRikfmDhxIm+88QaDBw+mYcOGTscpV2xsLKBFXvmPN0X+MeBCIFlElovISBEJ7k+QqpWWLVtGYmIi06dPdzpKhVxF/ptvvqGoqMjhNCoUVbrIG2PeMMZcCHQHlgDPYrXu3xKRi/wVUClvZWdn07p165Jx6MGsWbNmAIwdO5bZs2c7nEaFIq93vBpjNhpjHgZaY7Xu/wz8ICK/isidIqI7c5WjsrOzS1rIwc49Z3p6uoNJVKjyuiCLSF0RGQIsBF4GVgC3Am8DTwLv+jKgUt7at29fjSnyMTExJTuGMzMzHU6jQlGlR8yIyJnACOAmoBB4B7jXGLPZbZ7PgdW+DqlUZR07doz9+/cH7YgaT2FhYWRlZXH66adrS175hTfDIlcBXwN3APONMcdKmScNeN8HuZSqkpycHIAa05IH6xevbdq00SKv/MKbIt/RGPNbeTMYYw4Bt1UvklJVt2/fPqBmFXmAtm3bsmrVKqdjqBDkTZ/8/0TkpO/AIhItItt8mEmpKtmyZQtJSUkANaa7xqVNmzZkZ2dTUFDgdBQVYrwp8u2B8FKmRwKtfJJGqWp4++23S67XxJY8wLZt2l5SvlVhd42IXOd2c6CIuJ+vLBy4FKsvXilHNWjQoOR6TSvyffr0AWD+/Pl069bN4TQqlFSmT/4j+68B3vS4rxCrwD/ow0xKVYn7EMSa1l3Trl07evXqxZw5c3jsscecjqNCSIXdNcaYMGNMGLADaO66bV8ijTFdjDGf+z+qUuXbs2cPzZs3Z+HChdSvX9/pOF7r378/KSkpekRK5VPeHNaggzFGj6KkglZGRgZJSUn079/f6ShV0rx5c+D4MFClfKHc7hoReQD4tzHmsH29TMaYl3yaTCkvZWRkcM455zgdo8pcXUz79u2jdevWDqdRoaKiPvnRwEzgsH29LAbQIq8ctWfPHlq0aOF0jCpr0qQJcHysv1K+UG6RN8Z0KO26UsEmLy+PQ4cOER8f73SUKnNvySvlK9U6YqSI1PFVEKWqY/fu3QA1uiWvRV75gzen/7tPRK53u/0WUCAim0Ski1/SKVVJycnJACQmJjqcpOpc3TW641X5kjct+fuALAAR6QMMxjqW/M/Ai76PplTlrVu3jrCwMLp37+50lCqLioqifv362pJXPuXNAcpaAdvt61cBc40xH4rIBmCxz5Mp5YV169bRuXNn6tWr53SUamnatKkWeeVT3rTkDwLN7euXAd/Z1wuBKF+GUspb69at4/TTT3c6RrU1bdqUmTNn8uyzzzodRYUIb4r818AbIjIdSMA6MxRAN4638JUKuA8++IC0tDR69+7tdJRqS01NBeDJJ5/EGONwGhUKvCny9wBLgWbADcYY196hM4H3fB1MqcoaN24cZ511FqNGjXI6SrV16tSp5HpaWppzQVTI8OawBgeNMaONMdcYY750mz7OGDPBP/GUKt/Ro0dJTU1lwIAB1KlT80f0LliwgI8//hiARYsWORtGhYSqnMi7pYj0FJEz3S/+CKdURbZu3UpRURGdO3d2OopPtGjRgkGDBnHKKaewdu1ap+OoEODNibzPAGYDiYB43G0o/YQiSvnVpk2bAOjSJXR+quE65+uuXbucjqJCgDdDKKcB6cBfgN1YhV0px0ybNq2kHz5UWvIurVu31hN7K5/wpsgnAWcYYzb7K4xSlXXkyJETdrQ2btzYwTS+17p1azZs2OB0DBUCvCnyG4AWgBZ55bh58+YBMGzYMM466yyH0/hemzZtyMjIoLCwMCR2KCvneFPkHwP+ISJPYBX8Qvc73YZUKuV3CxYsIDY2lhkzZhAWVq3j7AWl1q1bY4xh9+7dtGvXzuk4qgbzpsh/a//9mhP74wXd8aoCqLi4mK+++op+/fqFZIEHSk4asnPnTi3yqlq8KfJ9/ZZCKS+kpKSQlZXFZZdd5nQUv2nbti0AK1eu5KKLLnI4jarJKl3kjTHf+zOIUpW1bds2AJKSkhxO4j+JiYlccsklPPbYY1xzzTV07NjR6UiqhvLqu66InCYik0VkoYjE29OutcfQKxUQO3bsAI63dkNRWFgYs2bNorCwkGnTpjkdR9Vg3pw05HJgFdYhh/8AuI7peiowzvfRlCrdjh07iIyMpHnz5hXPXIO1bt2agQMH8vbbb1NUVOR0HFVDedOSfwZ4wBgzCDjqNn0RcK4vQylVnh07dtC2bVtEPH94HXr69+/P3r17ycjIcDqKqqG8KfLdgQWlTM8BmvgmjlLl27hxI6tWrQrprhp3LVu2BGDPnj0OJ1E1lTdFPgerq8bTmcBO38RRqmzFxcUkJSWxbds24uPjnY4TEK4i7zpRuVLe8qbIvwtMEpHWWOPiI0TkYuAF4B1/hFPKnfvP/Bs2bOhgksBx/TPTlryqKm/GyT8BzAB+w/oBVArWP4k5wHM+T6aUh2+/tX6P989//pMbb7zR4TSBERcXh4hokVdV5s04+UJgqIg8idVFEwasNcZs8Vc4pdwtXryYzp07c//99zsdJWDq1KlDs2bNtLtGVVm5RV5E3qrg8f1dIxyMMSN8FUqp0qSnp5OQkOB0jIBr2bKltuRVlVXUkm/mcbsPUIx1gDKwRtyEAT/4OJdSJ8nIyOCMM2rf7+7i4+O1Ja+qrNwib4y5ynVdRB4FCoDbjDGH7GkNgDc5XvSV8ouioiIyMzNp0aKF01ECrkOHDixevJi8vLxas8NZ+Y43o2vuA8a7CjyAff0ZYLSvgynlbt++fRQVFdXKIj906FDy8vJ49913nY6iaiBvinxDoGUp0+OB+r6Jo1TpXL/4rC3j491dcMEFnHbaacyaNcvpKKoG8qbIfwy8LSJDRKS9fRmC1V0zzz/xlLK4djzWxpa8iNC/f39WrlxJQUGB03FUDeNNkb8L+AxrrPxW+zIT+AK42+fJlLItWbKE/v37A7WzyAP06dOHo0eP8uOPPzodRdUwlS7yxpgCY8zdQFPgDPvSxBhztzEm318BlXrrreMjeWtrke/VqxcAffv25euvv3Y4japJvD53mjHmkDFmvX05VPEjlKqevXv3llxv0KCBg0mcEx0dzcsvvwzAp59+6nAaVZOE5gkyVUhZv349N910E7///rvTURw1ZswYLrvsMpYuXep0FFWDaJFXQW3//v2kp6fTs2dPHSMOXHTRRWzYsIHc3Fyno6gaQou8ClrHjh1jxAjraBnnnqvnpQGryBtjdAesqjQt8ipovfbaa3z66adMnDiRiy++2Ok4QeG8884jLCyMDz/8kGXLljkdR9UAYozx+0rOPvtss3r1ar+vR4WWVq1akZiYyLffflsrTvVXWWeccQY///wzYH3bCQ8PdziR8hcRWWOMObs6y9CWvApK2dnZ7N69m4EDB2qB9+D+q9/Nmzc7mETVBFrkVVDauHEjAF27dnU4SfB58sknadXKOhPn2rVrHU6jgp0WeRWUXEU+KSnJ4STB54ILLmD79u1ERkby008/OR1HBTkt8ipoGGP46aefMMawceNG6tevT5s2bZyOFZTq1KnDaaedVtI3r1RZtMiroLFw4ULOOusspk6dyooVK+jatSthYfoWLUtCQgLbt293OoYKcvoJUkEjJSUFgLvvvpsVK1Zwyy23OJwouLVr14709HSKi4udjqKCmBZ5FTQ2bdpUcr13797ceeedDqYJfm3btqWwsLDkWPtKlZU3O/gAABbMSURBVEaLvAoaKSkpXHzxxRQWFvLDDz9Qp04dpyMFtXbt2gHw22+/OZxEBTMt8iooGGNISUmha9euRERUdH55BVZLHmDHjh0OJ1HBTIu8CqjDhw/TrVs3Jk+eDEBBQQE5OTlkZmZy4MABHTLpBVdLfsiQISX7M5TypEVeBdScOXNISUlh8uTJGGMYPXo0TZs2LTlWuhb5ymvUqFHJdfcTqyjlTou8ChhjDC+//DIRERFs2rSJDz/8kDfffBOAf/zjH4D+wtVby5Yto27duifstFbKnRZ55VfGGBYuXMjChQupV68ev/zyC5MmTSIhIYEhQ4YA0KNHj5L53Y/Loip2wQUXcN1117Fq1So9LaAqlRZ55VdLly5lwIABDBgwgCNHjhAXF8ddd91FcnIyo0ePpmvXrjz99NMl8+vByLzXrVs3MjMz6devH1u3bnU6jgoyWuSVX7kf8/yOO+5gzZo1REZGEhkZyauvvkpKSgp/+MMfAGjZsqVTMWu0Ll26lFz/9ddfHUyigpEWeeVXS5YsAawW+l133VVy9ER3DRs2ZP78+fzwww+BjhcS+vfvz+DBgwHYsmWLw2lUsNEir/ymuLiYpUuXMmLECLZv307Pnj3LnPfqq6/m1FNPDWC60HHKKafwwQcfEB0dzebNmykqKnI6kgoiWuSV3/z666/k5OTQu3fvkjHdyj9EhE6dOvHaa6/Rt29fLfSqhBZ55TeurppevXo5nKR2yM/PB2Dx4sUlQ1OV0iKv/GbJkiXExcVpN0yATJgwgZtuuomEhAQWLFjgdBwVJPQgIcrnUlNTWb58OYsWLaJXr146LDJArr76aq6++moGDBhAenq603FUkNCWvPK5oUOHMnz4cNLT0+nfv7/TcWqdNm3aaJFXJbTIK587fPhwyfUrrrjCwSS1U5s2bcjKyqKgoMDpKCoIaJFXPufaAdi6detSx8Ur/3Idgnjnzp0OJ1HBQIu88qnCwkLS0tJ48MEH2bx5s9NxaiXXyc/nzp2rQymVFnnlWzt27ODYsWN069aNevXqOR2nVnIV+ccff5z58+c7nEY5TYu88inXsVMSEhIcTlJ7uYo8wC+//OJgEhUMtMgrn8nMzOTOO+8kJiaG0047zek4tVZkZCQHDhygbdu2epx5pePkle88+eSTZGZmsmLFCqKjo52OU6s1btyYrl276lEplbbklW/k5OTw5ptvcuedd3LmmWc6HUdhHYL4119/xRjjdBTlIC3yyieWL19OcXEx119/vdNRlC0xMZFDhw6xbds2p6MoB2mRV9Vy7NgxioqKWLJkCeHh4ZxzzjlOR1K2K664grCwMKZNm+Z0FOUgLfKqyubOnUudOnXo06cPEydOpEuXLtSvX9/pWMrWvn17brjhBl5//XUdL1+LaZFXVTZr1izg+Cn+Ro0a5WQcVYr+/fuTm5urXTa1mI6uUVW2d+9ewDozUVpaGk2aNHE4kfLUvXt3AJKTk+nUqZPDaZQTtCWvvHbs2DFuueUWfvzxR+644w42bNigBT5IJSUlISJ88803HDhwwOk4ygFa5JXXNmzYwDvvvAOgp/YLcg0aNMAYw2uvvcaVV17pdBzlAC3yymtr1qwBrJNUaOEIfq7DPS9dulRb87WQFnlVoeLiYtavX1/yo5o1a9bQqFEjPvnkE/1law0wZ84cPv74YwA+++wzh9OoQNMir0pljCkp6mPHjqVHjx689NJL3HvvvUydOpWePXsSFqZvn5ogJiaGa6+9lk6dOjFx4kSOHTvmdCQVQPopVaW69957adWqFVdddRWTJk0iNjaWhx56iClTpgBw3XXXOZxQeSMsLIy///3vpKSkaGu+ltEhlOokqamp/Pvf/wbg888/59JLL+Wjjz5iypQpHDlyhKeeekpPzl0DXXXVVdSpU4dVq1YxaNAgp+OoANEir04ya9YswsLC2L17N7///jutW7cmKiqKxx9/3Oloqhrq1q1LUlISP//8s9NRVABpd406wZEjR1i3bh1dunQhLi6OhIQEoqKinI6lfKRnz55a5GsZLfIKgHnz5jF58mSioqKYP3++nvQjRPXs2ZM9e/awa9cup6OoANHuGkVBQcFJhwg+/fTTHUqj/Onyyy+nTp063H777cyfP5+6des6HUn5mbbka7HCwkIAFi5ceNJ9epyT0JSUlMSUKVP48ssv+dOf/qRHp6wFtMjXUhs2bKBRo0bceOONjB07lmbNmnHgwAG2b9/ODTfcQL9+/ZyOqPzkL3/5C6+88grz58/n+eefdzqO8jMJxKnBzj77bLN69Wq/r0dV3qhRo0pOJhEWFsbcuXN17HstYoyhf//+bN68mW3btumQ2CAlImuMMWdXZxnakq+F8vLymD17NrfddhsZGRls2LBBC3wtIyJce+21pKWlsXnzZqfjKD/SHa+1QFFREYsWLeLw4cN89dVXnH/++eTn53PrrbcSFxdHXFyc0xGVA1xdcgsXLqRLly4Op1H+ot01tcALL7zAww8/TFhYGMXFxQC0bNmS9PR0Pf5MLXfmmWdSWFjI+vXrtcsmCGl3jTpJcnJyyQGoPvnkE0aNGsUTTzxB3bp1Swo8wOjRo7XAK0aPHk1ycjKLFi1yOoryE23Jh5DNmzfTpUsXRo4cybRp02jZsiWZmZlcdtllTJ48mTVr1nDqqaeyd+9ePQ68AuDw4cPEx8dz5ZVXlpyzVwUPX7TktU++hurVqxdXXHHFCceTmTNnDgBvvvkmSUlJZGZmMnv2bIYOHQpA586dHcmqgldUVBRDhgxh5syZTJkyhUaNGjkdSfmYtuRroP3795ecU3XevHkMHDiQkSNHMnv2bDp27Eh+fj4ZGRlERkayd+9e/eCqcv3444+cf/75TJ8+nZEjRzodR7nRlnwt9OqrrzJ37lwAGjVqxPDhw0lKSmLlypX8+c9/ZsyYMbRv35558+bRqVMnLfCqQueeey6JiYnMnDmT888/n/DwcBITE52OpXxEW/I1xMGDB8nOzqZr164cPXoUgC+++IJ//vOfbNy4kQkTJjBs2DCHU6qa6vnnn2fs2LE0bdqUFi1akJyc7HQkhW9a8lrka4Bly5ZxySWXEBcXx86dO0um5+bmaktd+URWVhZt2rThyJEjgHXYi+7duzucSukQyhA3Z84cxo0bx5gxYygsLGTnzp3ceuutJfdrgVe+0qxZM0aMGEG7du0ICwtj0qRJBKIBqPxPW/JBZvXq1SxatIibb76Z+Pj4kulTpkyhuLiYkSNHUr9+fQD9ECqfKioqorCwkKeeeoqJEydy77338uqrr+qPpBykO15DzNtvv82IESMAmDlzJgAXXXQRN9xwA3fffXfJfH/9619p1aqVIxlV6AoPDyc8PJwJEyZQWFjIiy++SJ8+fRg8eLDT0VQ1aEs+SBhjiI2NpXv37mzatInMzEzOOeccVq5c6XQ0VQsVFRXRtWtXYmJiWLFihbbmHaJ98iFk+/bt5OTkcPPNN3PzzTcDlLTqlQq08PBwHnjgAVauXMnSpUudjqOqQYt8kHB90znrrLO4/fbbGTRoEH/+858dTqVqs+HDhxMbG8ull15a8mtqVfNod43DMjIy2Lx5MwMHDiQ/P5+CggI976YKGq+++ir3338/APPnzyczM5Nhw4YRFRXlcLLaQbtraqB58+bxf//3fxw9epSlS5cSHx/PxRdfTF5eHr1799YCr4LKfffdR35+Pp07d+aaa67hjjvu4JVXXnE6lvKCjq4JkIKCAsaNG8eLL75IcXExK1asYP/+/QBMnTqVAQMGEBMT43BKpU5Wr149RowYwdixYwFYtWqVw4mUN7Ql7wfGGLZv314yjv3IkSM8+uijTJo0iZtuuolp06axfv16kpOTueeeexg1ahRt2rShYcOGDidXqnQ33XQT4eHhAHz77be8+OKLrFmzxuFUqjK0T94P/vWvf3HfffcxcOBAGjZsyDfffENOTg533HEHr7/+OgA5OTnMnz+fq6++mqZNmzqcWKmKpaSksH37dgYNGkRhYSH16tVj+vTpbNiwgQcffJDY2FinI4YcPXZNEHG13jdt2sTgwYM5dOgQDRo0oEmTJnTq1ImRI0dy/fXXExkZ6XRUpaqlqKiInTt30rt3b9LT0wGYOHEijzzyiMPJQk9I/uJ1woQJ7N27lzPOOIPw8HD+9Kc/BfXOyKysLJ5//nm+/fZb1q1bB0D79u3ZuHEjrVu31h+RqJATHh5Ou3btSE5OZs2aNdx22218/vnnPPjggwDk5+fTsGFDPb1kkHC8yOfn5zN16lSysrKYOHHiSfe/8cYbvPHGG0yZMoVLLrmEQYMGOZDyRMYYtm7dytq1a3nuuedYt24dp512GpMnT6Z58+ZccsklNGvWzOmYSvlVo0aN6Nu3L8OHD+eZZ56hQYMGdOzYkYyMDE4//XSGDh3KsGHDqFevntNRa7WAddesWrWKgoICtm3bRnJyMjt37kREeO+9907agZOSkkJkZCRLly7lL3/5S8nhT8PDwxkwYAB79uwhNzeXZ555hh07dpCVlUViYiLnnnsuzZo1Q0SIiYnhwIEDpKam0qFDByIiIsjLyyMnJ4eoqCgiIiLIycmhcePGZGdn06NHD/Lz8wkPDycrK4vs7GyOHj1KRkYGkydPJiYmhk6dOjFnzhyKioo4dOhQyQmzn3rqKf72t7/5fTsqFYx27NjBo48+SnR0NK+99toJB86LiYmhc+fOjBw5kiZNmpCRkcE111xTMrIsKSmJ8PBwjhw5Qn5+vo4w81Bj+uRjYmJMXl5eSVF0165dOx577DEA+vbty6FDh+jZs2fJ/T///DPfffcd5513HjNnzmT58uU0aNCAbdu2kZ2dDUCdOnUoLCw8adki4pMjNXbt2pW8vDzS09MZNGgQ0dHRnHLKKdx6663k5ORw8cUXExHh+JcipRz30Ucf0bx5c84//3y+++473n//fZYtW0ZqamqZj6lXrx6FhYUcO3aMBg0aaBenm7y8vJpR5EXkd2CT31dUM8QC2U6HCBK6LY7TbXGcbovjuhhjTqnOAgLV/NxU3f9GoUJEVuu2sOi2OE63xXG6LY4TkWoPS9Td30opFcK0yCulVAgLVJGfFqD11AS6LY7TbXGcbovjdFscV+1tEZAdr0oppZyh3TVKKRXCtMgrpVQI0yKvlFIhzCdFXkTCRWSSiGSJyO8i8rGIlHncURHpLyK/iEiBiCSLyOW+yBEMvNkWIjJARP4rItkisl9EFotI70Bn9hdv3xduj7tLRIyIPBGInIFQhc9IcxGZKSL7ROSgiPwsIi0DmdlfqrAtHhKRrfa8W0Tk7kDm9RcRGWJ/5g+KyMmHAzh5/rNFZKWI5Nvb4+bKrMdXLfmxwDXAeUBre9qs0mYUkY7APODvQGP77yci0t5HWZxW6W0BxAD/AhKAZsC7wEIRaePvkAHizbYAQETaAQ8CG/wbLeC8+YxEAd8BR4EuQDQwFMjzf8yA8GZbXA08BQy1f/k5HJgkIpcFIqif7Qf+DYypaEYRaQwsBD7Gqht3AlNF5IIK12KMqfYF+A0Y6Xb7VMAA7UqZ9ylgsce0xcA4X2Rx+uLNtijj8RnAdU4/D6e2BfAtcCOwCHjC6efgxLYARgHpQB2ncwfBtngAWOYxbTnwkNPPw4fb4xLgWAXz3GZvN3GbNgt4u6LlV7slLyLRQFug5FCSxpitwEGgRykP6eE+r+2nMuatUaqwLTwffxrWcTtqfCu2KttCREYBh4wxHwQkZIBUYVv0BbYAM+zuml9F5K8BCetnVdgW7wONROQiEQmzuzM7A18GIm8Q6QGsNXZ1t1Wqbvri2DWug+fkekw/ADQqY/7S5u3mgyxO83ZblBCR5lhfxV4wxmzxQ7ZA82pbiEhb4AngfD/ncoK374tYrEI/BqsFdzrwpYjsNcbM8VvKwPB2W+wFPgL+x/Hu5THGmGT/xAtaZdXNcusK+KZP/nf7b2OP6dFY/51Lm7+y89Y03m4LAOwdav8DvgYe9U+0gPN2W0wHnjXG7PJrKmdU5TOyyxjzijHmqDFmNTAbqx+7pvN2WzwJ/BnoCdTBarn+VURG+i1hcKpy3ax2kTfGHAB2AGe6ptk7VxsB60t5yDr3eW1n2NNrtCpsC+wdzouBhcaYez2+jtVYVdgWlwET7JFG2cBFwKMisjgQef2pCtviZ6w+6pMW5ZeAAVSFbXEW8IkxJsVYfgE+Ba4KRN4gsg7rH527ytVNH+04eBzrePEdsF6sucCXZcx7KpAP3IT1n/km4BDQ3ukdIA5si0RgJ1YL1vHsDm+L1h6X5cA/gDinn4cD26Kd/Rm5BwjHar1mATc6/Twc2BaP2vN2sm93BbYCTzr9PHywHcKBKOBy4Jh9PQq3natu80bb74GHgbrApVijrS6ocD0+DPsC1oH+f8caIhlr3zcUyPOYvz/wC1Bg/73c6Q3u4xeuUtsCeBurdZbncRnq9PNw4n3h8dhFhNboGm8/I5cAa7EaQFuAe5x+Dk5sC6z9hhOBNPuzsQN4kRAYeQTcan/+PS/tgd72823rNv85wEq7bm4Dbq7MevQAZUopFcL0sAZKKRXCtMgrpVQI0yKvlFIhTIu8UkqFMC3ySikVwrTIK6VUCNMir2oEEflcRGbY1xeJyGSHIylVI/jiAGVKBdp1QGFlZhSR8cANxpjufk2kVJDSIq9qHGNMjtMZlKoptLtGBR0RqS8iM0QkT0QyReQxj/tP6K4RketEZL19OskcEfleROJE5FZgHNDNPp2gsachIg/YjzkkIrtEZLp9rHPXMm+113+pfYrKQyLyPxHp4JFlgIj8aK97n4h8Zp/ZCRGpKyLPi8hO+5Rtq0Skn/+2nFIn0yKvgtELWEelvB7rQExnAH1Km1FEWmCdWGIm1sGr+nD8VHIfYB3nZBMQb19cJyQpxjpeezesQ9mei3UqRneRWAfIGgFcgHWQqKlu6+4P/Af4ButoiX2B7zn+uXobuNhefnc742ciUuNPkKNqDj12jQoqItIQ2AeMMPYJMuxpO4FPjTG3isgiINkYc6+InIl1lqH2xpjfSlneeCrRJ28X7PlAPWNMsd3ifxtINMZssucZCrwFRBljjIgsBdKNMUNKWd6pWAcWa2+M2eE2/VNgtzEmJE5GrYKftuRVsDkV61Cqy10TjDF5lH1KxHVY54VNFpGPReQuEWlW0UpE5A8i8o3dleI6EmJdoIXbbEdcBd62254nxr59BtYJt0tzJiBAit3tkyciecBA+zkqFRBa5FWNZowpwjoe9+VYJ50YCWwpr0tERNoBXwAbgcFYXS0j7Lvrus16zHN19t/KfG7C7PnPwTrZg+vS1W1dSvmdFnkVbLZiDY8sOderiDTA6tMulbEsN8Y8hVVUdwM32ncfxTp+ubuzsYr5X+3HbQZaViHrWqx9BmXdJ0ALY0yqxyUUT3GogpQOoVRBxRiTJyJvAs+LSBZWwf4bJxdqAETkfOCPwFdAJlYXShsgxZ4lDWhn993vwDpJxRasBs4YEZmH9Q9lTBXiPoe1IzUVeBerqF8OvG6M2Swic4AZIvIg8BPQBOtkINuMMfOqsD6lvKYteRWMHsI6sfkn9t9k4Icy5s3FOh/s51jF+0XgGWPMbPv+j4EFWH3nWcBNxpj1wP3AA1j/DG631+kVY8wCYBBwBVbL/XusETbF9iy3Ye28/Qfwq52xD3DSDmKl/EVH1yilVAjTlrxSSoUwLfJKKRXCtMgrpVQI0yKvlFIhTIu8UkqFMC3ySikVwrTIK6VUCNMir5RSIez/AdLz0P0GaHJRAAAAAElFTkSuQmCC\n",
      "text/plain": [
       "<Figure size 432x288 with 1 Axes>"
      ]
     },
     "metadata": {
      "needs_background": "light"
     },
     "output_type": "display_data"
    },
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAW8AAAEfCAYAAACOHkfVAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4yLjEsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy+j8jraAAAgAElEQVR4nOzdeXxMVx/H8c/JShJEEvsWu9qqatfaHlR5rF20te+lu6Kq1KMriqqitStK1VJaRYtaSuwaa0uFEFtCIiKrLOf5407SJBJJdJKbSX7v12tembn3ztzvTGZ+c+bcc+9VWmuEEELYFjuzAwghhMg6Kd5CCGGDpHgLIYQNkuIthBA2SIq3EELYICneQghhg/JF8VZK9VdK6XQubZPN9zY7a1YppfxTPZ9wpdQJpdRrSin1kI+Z3muV/OJt3WeSfZRSS5VS/mbnyAqzMyulOlr+z8PSmPemZV67HMxjc//D7OZgdoAc9hxwJdW0M8BxoClwPccTWccvwP8s1wsD/wVmAU7A9Id4vKbpTC8PfIvxel17iMc1y4fAF2aHsCVa681KqW+BqUqpn7XWVwCUUpWAj4HFWuttpobM5/Jb8fbVWp9PZ97NHE0CKKWctdYxVnioW1rrA8lu/6qUqg88z0MU71SPBYBSygmjAIYDz2mt7z1s2JymtfYzO4ONegNoD8wDOlmmLQDuAG+bFUoY8kW3SUbS6jZRSrkopb5SSgVbuiJ+UEo1syzXP9lyu5RSu9J4TH+l1NI01tFCKbVGKRUKHLTMc1BKvauU+kspFaOUuqaUmq6UKvAvnlYY4Pgv7p/aDKARMEBrfTH5DKVUB6XUfqVUlFLqjlJqg1KqeqpllFLqLaXUWaXUPaXUdaXUbKVU4VTLaaXUR0qpt5VSl5RSkUqpn5VSxS2X7y3rCFBKvZOZ4Kl/ciulvBO7BJRSH1iyhCqlflJKlc3E42X1ubyulLqolLqrlNqtlKqVmdxprLeUUmqZUuqW5X1yQinVO9l8L6VUQqppnS05ViSb5mLJ/cqD1qe1DgZeBzoqpXorpYYAbYARWuvQDLKeVkqtT2N6I0ue7pbbVZRSyy2vT5RS6oLlc1c0g8dvZXmcVqmmp9kFqpQaqpQ6rpSKtrx+i5RSHqmWeUMp9aclx22l1JHEnLmS1jrPX4D+gAaqY/zaSLzYp5rvnew+K4AY4F2gHTAZuGhZrn+y5XYBu9JYpz+wNI0MAcBUoC3QwTLvOyACeN8y/TUgFFiXiefmj9GVkficigJ9gVjgnVTLLjX+5Vl+/V60ZJ+WxrwOQDywDegCvAScx/glUybZcp9YHmM28BTwFkYr/nfALtlyGrgE/IzR2huI8UW0FdgHjLe8RvMsy3bMRP6lgH+y296W+/oDK4GngX7ArbT+l2k8Xlaeiz9Gt1YX4FnLe+g84JDFzK7AOcvrOtSS+VvLOoYmW+4ERpdG4u3PgUjgarJpT1nu90gm//8/AsGW9+T3mbzPWCAaKJpq+peWx3Ky3G5heT27Wq73tzzP/Rm8Hq0sz6FVOp/15J/lyRifh+kYvyQGAFcxGk+JNaAXEIfxGWwNdLQ8h0H/tv5k18X0ADnyJP/5h6a+7E3rH45R5BOAMakeZxb/vnh/nmq5Jy3T+6aa3ssyvV4Gz80/nec2H1Cpll0ExGXxtXsEuItROO8rOMAR4O/k84CKlg/LDMttD4wvwqWp7tvbkrVLsmna8uFN/ngzLNPHJ5vmAAQBSzLxHFJ/8L0tj7cr1XKjLNNLP+Cxsvpc/gYck0171jK9WRYzv0raxWq75XVILEJfABeTzffFKFoaqG6ZNhm4noX3QHXL/SOB4pm8TzmML/VhyaY5Ynz5zH3A/RyAJyzre+wBr0erdF6P/qT8LHtbcryfarnmluW6WW7PBo5l5bNh9iW/dZt0BxomuwxKZ7nGgALWpJq+1goZfkh1uwNwD1hr6T5xUEo5AL9a5rfIxGNu4Z/n1BIYDbyA8YZMorUepLXO9HYOpZQrsA6jBdVTax2Xxvz6wOrk87TRrbLPkgWgCcbG0xWk9B1Ga6dlqunbUq3rL8vfX5KtIw6jBVsuWR6HVK9hRjanun3S8rf8A+7zMM8lNovrSEsLjNbzrlTTVwDFgJqW278B3kqpikopT6AusBzjC7GNZZk2GI2OzEr84ihI+huzU9BaB1jW0SfZ5A6AlyUPYGxLUUqNs3QZRmF86f9umZ2i6+0htcPoHv421XvjIEajJPHzdRiop5T6Uhkj0FyssO5sld82WJ7S6W+wTK6U5W9QqumBVsiQekRLcYxiEJHO8p6ZeMwQrfWRZLf3KKUUxkiBOVrrMw+RE4yNUzWAp7VltEEqRTG+5NIapXMDqGC5nti3mGI5rXWcUio42fxEt1PdvveA6QXA6MfG6JJIopSqqLX2TyNbopBUtxM3Hj9oW0NWn8vDrCO99ab3OifPtQfjV2NrjA2LtzFGB+0EWitjBEl9jP9thpRSzYARGN2HHYC5SqmdWuuwTNx9ObDE8n+4iFHIz2ut9ydb5lOMbsIPAB+MgloWWE/WX6O0FLf8Te9zn/j5WmZZ3yCM5xurlNoMjMzgPWSa/Fa8MyvxQ1KclAWhRBrLRmMMz0st9Yc4kU51O9jyGE+ms/zDDsk7bflbB2M4ZJZYNma9CHyktf4lncVuYzyfkmnMK8k/hSsk2bTEXFhaQJ7cX+AexjWMXx6pp1lbTjyX9NabVku0ZLL5aK1vK6V8MVrXdzC6hrRS6jeMX2KtAHuMYv5ASilnjK42X2Aaxq+wkxjbbF7OROZ1wBygt1JqFtAZo1gn9wKwTGv9UbL1umXisaMtf51STU/d2Am2/G3P/V/+SfO10XcyD5hn2VjaHqO7aTXGL/FcJ791m2TWIYyi9Fyq6alvg7FxrZoyhtIBoJRqARTK5Lq2YnzjF9FaH0nj8rAFqK7lb5aHQCqlGmL0M/8GTExvOa11BHAUeE4pZZ/s/hWAZvzz0/wARiv5hVQP0ROjAbGLf0lrfS+N1y47hjNm+3NJx26grFKqearpL2H8Qkz+Bf0bRsu7teU6GMXaC2P0SEAmf4G+D1QBBmut4y33+R8w1PIefyCt9V1gA8b2gGcBZ+7vbnLB6CpJbkAmsl2y/K2danqnVLe3YfwSKZ/O5+tiquXRWt/WWq8Gvk/j8XMNaXmnQWv9l1JqJfChUsoOo0C1wWg5gPFmSPQdxtb/xcoYGlgRGInR6snMunYppVZh9HnPwPjiSMDY0NIRY8TIuQwexksp1cRyvSBGS+E9jJ/LexIXUkotAvo9qN/b0upYY8kwC2ik0t5R84zlp/MEjJEhm5RScwE3YBLG859ueY4hSqnpwLtKqQiMvuZHgI+AvZb72wQTn8tSjHHX65VS72HsbNYLo093mNY6PtmyOzE2vpa2XEdrfVMpdRr4D0YXwQMppR4FxmCMMPoj2azpGF9U85VSj+qM91NYjvEFMwnYp7W+kGr+VqCfUuokRtdGD4wv/gfSWl9XSu3G+D/cwvgC6w1USrWcn1JqCjBbGcNXd2O02sthvHYLtdY7lVLzMbps9lseqxpGN8+v5FZmbzHNiQv/bIGuksF872TTXICvMH6OhmMMl+pkWa5rqvsPwxhVEIXRb/c46Y82uS8Dxi+gNzCKbTRG4TuO8fO0SAbPzZ+Uo0yigbOW+3qkWnYpGQwV5J+t+BldWiW7TweMN32UJftGLCMbki2jMIbUncVouV7H+EldONVyGqOrJsP/H0Yrd28m/v9LSXu0yeB0nnurDB7v3zyXxHX3z0pmy7RSGMXwFkbf+Qmgdxr3LYTRmr2eavoXmVy3PcYoonNAgTTm18fYOPtxJl57e8vro0k2pDHZfC+MBtBty+VbjO6vFDnTeT3KAj9hDGG8gTHkcDCpPsuWZftg/GqKwPg8/4nRjVTWMr+f5f0UZHltL2IMsyyc0XM066IswUUmKKVGYRRFb631ZbPzCCHyL+k2SYdS6r8Y/V2+GF0IT2L8FP1eCrcQwmxSvNN3F+iGsZeVK8YeWbN4wAY8IYTIKdJtIoQQNkiGCgohhA3KkW4TLy8v7e3tnROrEkKIPOPo0aO3tNbF0pqXI8Xb29ubI0eOZLygsHkBAQEAlCtXLoMlhRAZUUpdSm+ebLAUVtWnj3Ecol27dpkbRIg8Toq3sKrx48ebHUGIfEGKt7Cqtm3bmh1BiHxBRpsIq7pw4QIXLqQ+fIUQwtqk5S2sauDAgYD0eQuR3aR4C6uaNGmS2RGEyBekeAuratky9VnAhBDZQfq8hVWdPXuWs2fPmh1DCJsXGPjgsy5Ky1tY1bBhwwDp8xbi30rcZyI9UryFVX3yySdmRxDC5l26dInt27c/cBkp3sKqmjXL8AxWQogMLFmyJMNlpM9bWNWpU6c4deqU2TGEsFnx8fEsXryYdu3aPXA5Kd7Cql599VVeffVVs2MIYbO2b99OQEAAgwYNeuBy0m0irOqzzz4zO4IQNm3RokV4enrStWvXBy4nxVtYVcOGDc2OIITNunXrFhs2bGDEiBE4Ozs/cFnpNhFW5evri6+vr9kxhLBJK1asIDY2NsMuE5CWt7CyN998E5Bx3kJkldaahQsX0qhRI+rUqcNG36sPXF6Kt7CqmTNnmh1BCJt06NAhTp8+zbx589h88jojvz/+wOWleAurqlevntkRhLBJixYtwsXFhRL12/H6qj94rJw7Dzq4svR5C6s6fPgwhw8fNjuGEDYlIiKC7777jja9XmX0D39Rq0wRlgx48MZ/aXkLqxo9ejQgfd5CZMWaNWuI9ajEX14tqFbCjWUDGlGogOMD7yPFW1jV7NmzzY4ghM2Zu+YXSjz7PpWKF2L5oMYUcXlw4QYp3sLKateubXYEIWzKD3t8uVGtBx5OsGJQYzxcnTJ1P+nzFlbl4+ODj4+P2TGEsAmnrt5hzM+XiI8MZfnABhQr9OAdc5KTlrewqnHjxgHS5y1ERs4F3qX3wgPEhIfyaPAu6lQZkqX7S/EWVjVv3jyzIwiR6124Gc5LCw5yLzqK6yvHsmrT2iw/hhRvYVXVq1c3O4IQuVpASCS9Fh4kPiGe6yvfpWOLRjRv3jzLjyPFW1jV7t27ATkRsRBpuXEnmpcWHiDyXjwNwnw4ceUcn2zOeqsbpHgLK5s4cSIgfd5CpHbzbgwvLTzA7YhYpv23PF2fmEy/fv2oVavWQz2eFG9hVYsXLzY7ghCm0Vrj5+dHlSpVUky/HXGPPosOcj00mmWDGjHnfyNRSjFp0qSHXpcMFRRWValSJSpVqmR2DCFMsWDBAqpWrcqECRPQWgMQFh1L38WHuHArggV9G1Aw/BrLli3j1VdfpVy5cg+9Lml5C6tKPON127ZtTU4iRM7SWjNnzhycnZ356KOPCA0N5dPPpjNwyRH+uhHGvD6P80RVL7p0GUjhwoV59913/9X6pHgLq/roo48AKd4i/zl8+DAnTpxg7ty5XLx4kWmff8Ee+3rcLViS2S/Vp02NEuzdu5effvqJTz75BE9Pz3+1PinewqqWL19udgQhTDF//nxcXFzo1asXzgVdOeD0GJdi3SgXsIPWVdqgteadd96hVKlSvPHGG/96fVK8hVX9mz48IWxVWFgYq1at4sUXX8TF1Y3XVv3B5bjCPO0VwvzPZvL0lWMMGjQIHx8f5s2bh4uLy79epxRvYVXLli0DoFWrVhkuW758+WxOI0TOWLlyJZGRkQwZOpTRa0+w5dQNJvy3JoOeqEjLsg707duX3bt3U61aNQYOHGiVdUrxFlbVr18/AOzsMh7IFB8fn91xhMh2WmvmzZvHo/Xq8fN1F3744xKjn6rOoCcqAvDiiy9SpEgRBg8ezIwZM3BwsE7ZleItrGrr1q0AeHl5ce7cOcaMGcPLL79M06ZNAdi/fz/z5s1jypQpZsYUwmqOHj2Kr68vvT9dwbL9lxjaohKvtE45zrtjx45cvXoVpZTV1ivFW1jVU089lXR95MiRfP755zz77LNJ09q0aUP16tX54osvePHFF82IKIRVzZ8/H48G/+X3UHd6PFaGsR1qpLmcNQs3yE46wsp++uknfvrpJ8A4G3bdunXvW6Zu3bocPXo0p6MJYXV3795lrc9ZCrUZRstqxZjybF3s7KxbpNMjxVtY1fTp05k+fToA3t7ezJ07975l5s6dS4UKFXI6mhBWN3XxWtyeeoPKHo7M7VUfR/ucK6nSbSKsau3af46Q9vnnn9O9e3e2bt1KkyZNADh48CD+/v6sX7/erIhCWMW5wLusuFwIh5gwVo/oiKtzzpZTaXkLq/Ly8sLLywuADh06cO7cOXr06EFYWBhhYWH06NGDc+fO8fTTT5ucVIiMTZ48mb59+3LmzJkU06+GRvHC13uJjY5kcNUYvAoVyPFsKvHgKdmpQYMG+siRI9m+HmG+xBZ1jx49TE4ixL8TEhJC6dKliYmJQSlFz549mTBhAt5VqvHMV/v5+1owgSvHEnDyAO7u7tmSQSl1VGvdIK150m0irOrjjz8GjP7ujNSvXz+b0wjx8JYvX05MTAw7duxg+/btzJo1i9WrV/PYK18S4urNnU3TeLZt02wr3BmRlrewKjs7O5RSZPS+UkrJTjoi19JaU7t2bQoVKsSBAwcAuHXrFoOmreY43tzetYSwg+vw8fFJ2ochO0jLW+SYixcvmh1BiH/Nx8eHM2fOsGjRoqRpJ4MTOKG8aV/Ng6JUIeyxl5M2xJtBirewqsRWSs+ePU1OIsTDmz9/PoUKFUp6H1+4Gc7rq/7gkZKF+aJXIwo6ZV9rO7OkeAur+uqrr4B/indgYCBz5szhzJkzKKWoWbMmI0aMoESJEmbGFCJdt2/f5vvvv2fAgAG4urpyNzqWocuP4mhvx/y+j1PQyd7siIAMFRRWtnnzZjZv3gzAvn37qFKlCitXrqRgwYIUKFCAb7/9lqpVq7J//36TkwqRthUrVhAdHc3QoUNJSNC8tdqXi7cimPNSfcoW/feHcrUW2WApsk3Tpk2pU6cOX3/9ddJRBhMSEnj55Zc5deoUPj4+JicUIiWtNXXr1qVgwYIcOnSIz7ed44sdf/O/zjXp37xijueRDZYix6xYsQKA3r174+vry9KlS1McHtbOzo6RI0fy2GOPmRVRiHQdOHCAU6dOsWDBAg5eCGbWb3/To34Z+jXzNjvafaTbRFjVwoULWbhwIQBFihRJc/TJxYsXTRsbK8SDzJs3Dzc3N57u+gxvrfbF29OVD7vWtvoRAa1BWt7CqrZt25Z0/YUXXmDQoEFMnTqVZs2aAUY/+DvvvCOHgxW5zu3bt1m9ejV9+/Xjk20XCbobw7rhzXL8mCWZlTtTCZvl6OiYdH3q1KlorRk4cCBxcXFJ84cPH87kyZPNiihEmr799luio6Op2q43sw9fZ/RT1Xm0XO79hSgbLIVVLV26FID+/fsnTYuMjMTPzw+AypUrW+Xkq0JYk9aaRx99FPsiJYlpPZI6ZYqwckgT7HPo2NzpkQ2WIsekVbxdXFyoU6eOOYGEyISDBw9y8vQZmoyfRIKdHZ/3rGd64c6IFG9hVbt27Uq6Hh0dzRdffMGOHTsICgoiISEhxbInTpzI4XRCpG3+/PkUa92P6zFOzO1Vh9LuBc2OlCEp3iLbjBgxgh9++IHnnnuOZs2a5cot9kIEBwez7vcTFO3xP55vUJaOdUqZHSlTpHgLq1qwYAEAQ4YMYcOGDaxZs4a2bduanEqI9H0xdx6F2r1C6cKOTOxcy+w4mSbjvIVVrV69mtWrVwNGX3e5cuVMTiRE+qKjo1l8JBiHQsWY3adRrh0WmBYp3sKqtm/fzvbt2wEYM2YMM2bMyPDY3kKY5dOFa3B4pA1tyztQv3xRs+Nkie18zQib0KVLlxS39+zZw9atW6lZs2aKMeAAP/74Y05GEyKF6HtxLPsrHmUXwsxBL5gdJ8ukeAurunnzJgA1atQAoHv37mbGESJdby78Be1WjL7lw3Ar4JjxHXIZKd7CqhKPWbJkyRKTkwiRvtPX7rDVP56EiwcZ/8FEs+M8FCnewqq2bNmSdD1xXHfiUQVv3LjBpk2bqFmzZtKxToTIaXHxCby2/CDxUWEMrl8UJycnsyM9FCneItt06tSJDh068MYbbxAeHk6DBg2IiIggPDycRYsW0bdvX7Mjinxo0d6LXLgdS8Sepby+a43ZcR6ajDYRVvXFF1/wxRdfAHDkyBHatGkDwPr16ylcuDBBQUEsWLCAadOmmRlT5FP+tyKY/utZos4fpHerWhQtalsjTJKT4i2saseOHezYsQOA8PDwpD7wX3/9le7du+Po6EibNm2SDlQlRE7RWvPu+pPouFhub/uKt9580+xI/4oUb2FVP/74Y9IQwPLly7Nv3z4iIiL45ZdfaNeuHQAhISFyZEGR43744yr7LwRzZ89SurZvRcWKOX9aM2uSPm+RbUaOHEmfPn1wc3OjQoUKtGjRAjDGfstRBkVOuhsdy6db/qKkYzQHD2xk1AHbPwG2FG9hVYl92aNGjWLYsGE8/vjjBAQE0K5du6RRJ5UrV+bDDz80M6bIZ2bt+Jtbd2PQ22bRvHkzGjdubHakf02Kt7Cq/ftTtmgaNGhAgwYpjyXfqVOnnIwk8rmz1++w6PcL3Du7h+t//M7XmzebHckqpM9bWNW6detYt25d0u25c+dSq1YtXFxcuHDhAgBTpkzh+++/NyuiyEd27NhBx/FLiIuOwDvsBEeOHKFDhw5mx7IKKd4i28ycOZOPPvqIoUOHpjg4VenSpZk9e7aJyURe5+fnR/fu3ek8fALxxarSvaoTe7dvoX79+mZHsxop3sKqJk+enHRy4a+//poFCxbwxhtv4ODwTw9d/fr1OX36tFkRRR63f/9+atasybadu6n0zGiqFXdj+std89zJQKR4C6vy9fXF19cXgEuXLlG7du37lnF0dCQqKiqno4l8Ys6cObi6ujJm8TbCtRMfdKuNg33eK3WywVJY1XfffZd0vVKlShw7dowKFSqkWGbz5s3UrFkzp6OJfCA8PJwffviBZ/oN5dtjQXR+tDRNKnmaHStbSPEW2WbUqFG8+uqrREZGorVm//79LF++nKlTp7J48WKz44k8aOPGjURGRhJRtQN2IZpxHWuYHSnbSPEWVpU4fnvChAkMGDCAuLg4xo0bR2RkJH369KF06dLMmjWLnj17mpxU5EUrVqygQqP2HA2MY/RT1SlVJPefBf5hSfEWVnX27FkA4uLimD9/Pt26dWPIkCHcunWLhIQEihcvbnJCkVcFBgby66/bqDv6W1yLFmTwk7a9+3tGpHgLq1qxYkXS9dGjRyftkOPl5WVWJJFPrF69mgLVm3MbN95vXw1nB3uzI2WrvLcJVuQaTZo04ejRo2bHEPnE8m9XUqLtYB4pVZiuj5YxO062k5a3sKr3338fgA8++IAhQ4YwatQoLl++zOOPP46rq2uKZfPSDhPCXOfOneNsXDE8XDwY06E6dnZ5a0x3WjJdvJVSDlrruOwMI2xfQEBA0vWXXnoJMI4umJpSivj4+BzLJWxXdHQ0WmsKFkx/4+PSFaso0qwn9cu40apasRxMZ56stLyvK6W+ARZprf/MrkDCtiU/8fDFixdNTCLyip49e3L+/Hl8fHwoUqTIffO11qz6Iwj7Wg2Y0LVuntuTMj1ZKd7jgAHAW0qpQ8BCYLXWOjxbkgmbl3rnHCEexh9//EFAQAC9evVi48aN2Nun3BD56579xFdpRa1CMTxW3nZPa5ZVmS7eWusFwAKl1CPAQOAjYKZSag1Ga3xfNmUUNuTdd98F4NNPPwXgypUr7Nmzh6CgoKSzySdKqztFiOSio6O5cuUKtWvX5ueff2bcuHFMmTIlxTKfbvwD5ViOyS81MimlObK8wdLSZTJaKTUWGAF8BvRTSv0NzATma60THvQYIu8KDg5Ouv7tt98ycOBAHBwcKFasWIqfs0opKd4iQxcvXkRrzdixY9m3bx9Tp06lTp069O7dGwC/wDtctC9LiXA/Hq3YxeS0OSvLxVsp5QT0wGh9twH2AouA0sAEoBXwgvUiClsyf/78pOvvv/8+b7/9Nh9++OF9P3WFyIzz588DUKVKFZ5//nn+/PNPBg8eTLVq1WjUqBHvrNiLTojj9TZVTE6a8zI9zlspVV8pNRu4jtHC9gVqaq1baa2Xa62nAO2BrtkTVdiawMBABg8eLIVbPDQ/Pz/AOHWeo6Mja9asoVSpUnTr1o2dvuc5chPiTm/j+S5PmZw052VlJ53DQGVgKFBWaz1Ga30u1TL+wHep7yjyj1GjRjFq1CgAOnbsyMGDB01OJGyZn58fhQsXxtPTODKgl5cXP/74I2FhYQz7ciMJ0eF0quSEk5OTyUlzXla6TSpprS89aAGtdQTGiBSRT506dQqA9evX065dO9555x1Onz5NnTp1cHR0TLFsjx49zIgobMj58+epUqVKiu0lderUYfK8FUw76UjYnuUMmPa6iQnNk5XivVMp1VBrHZx8olLKHTimta5k3WjCFv36668p/gJ88skn9y0nO+mIzPDz86NevXr3TT+ZUBZndY1WZRVNmzY1IZn5stJt4g2k1XnpDOT9AwmITElISMjURQq3yEhcXBz+/v5Urlw5xfQ/r4fxy+lAXm5Tgw3fr8o3O+WklmHxVkr1UEol/r7tlHjbcnkOmITR1y0Eb775Jm+++SYAy5YtIyYm5r5l7t27x7Jly3I6mrAxAQEBxMbGUqVKypEks387j5uzAwOb5+1DvmYkMy3vtZaLxhgSuDbZZQXQGng7uwIK2zVgwADu3Llz3/S7d+8yYIBsGhEPlnykSaJzgXfZfOo6/Zt5U8TFMb275gsZ9nlrre0AlFIXgYZa61vZnkrYrJkzZyZd11qn+ZP28uXLaR6jQojkko/xTjT7t/MUdLRn0BP5u9UNWds9Xl4tkSl16tRBKYVSipYtW+Lg8M/bLD4+nkuXLtGxY0cTEwpb4Ofnh7OzM6VLlwbgfFA4P524xtAWlSjqmv+GBqb2wOKtlBoJzNVaR1uup0trPcOqyYRNeuWVVyhYsCCdOnXi1KlTdOrUCTc3t6T5Tk5OeHt788wzz5iYUpZ5MfsAACAASURBVNgCPz8/KlWqhJ2d0bs7d+d5nB3sGPKkDGyDjFverwHfANGW6+nRgBRvQcGCBWnRogUTJ07E29ubF154AWdnZ7NjCRuUOMYbwP9WBBt8rzKweUW83OT9BBkU7+RdJdJtIjJj2rRpSdf79etnYhJhy7TWXLhwgf/85z8AzNl5Hkd7O4a2kFZ3on91DkulVP7e3CuEyBaBgYFERERQuXJlAkIiWf/HVV5sVJ7ihQuYHS3XyMqBqV5XSj2T7PZiIEopdVYpVT1b0gmbM3ToUIYOHWp2DGHjko80mbvrPPZK8XLLyhncK3/JSsv7deAmgFKqBfAc8BLG0QWnWz+asEWenp5JBxES4mEljvEuXLI8a49e4fmGZSlZRFrdyWXl2CZlgMSTEnYG1mitv1dKnQR+t3oyYZMSz6AjxL9x/vx57Ozs+MU/ngQNw1pIqzu1rBTvMKA4EAC0wziDDkAsIF+J4j4ffPBBmtOVUhQoUIAqVarQoUOHB54VXORPfn5+lK9Wi9VHr9CtXhnKebiYHSnXyUrx/hXjHJbHgCrAFsv0WvzTIhf5XOJu70uWLGHNmjVcvnyZiIiIpB0trl27hqurK8WKFSMgIIDixYuze/duKlWSUQTiH35+frg36k5oXALDW0mrOy1Z6fN+BdgHFAOe1VqHWKbXB1ZZO5iwTeXKlaNcuXIAvP322zRs2BB/f38uX77M5cuX8ff3p3Hjxrz//vtcu3aNatWqybksxX3OX77G3ZKP8XTtklQp7pbxHfIhpbXO9pU0aNBAHzlyJNvXI3KXihUrsnHjRurWrZtiuq+vL926dcPf358DBw7QtWtXAgMDTUopcpvbt2/j3ellirboy6bXnqB2mfx7HByl1FGtdYO05j3MCYhLY/R9p2i1a62PPVw8kVcFBgYSHR193/SYmBiCgoIAKFGiBJGRkTkdTeRip8+ep3CDrjzinpCvC3dGsjLO+zGl1GmMDZbHgCPJLoezJ56wNb1796Z3794AtG3blmHDhnH48OGkkzAcPnyY4cOH065dOwBOnjxJxYqy8674x3eHA7B3KcKARqXMjpKrZaXlPR+jcA8BrmEcz0SIFKpX/2d/rYULF9K3b18aN26cdAb5hIQE2rdvz4IFCwAoVKhQil3qRf4WExfPb9fsiL58kk6NW5kdJ1fLdJ+3UioCeCyNM8ZnSPq887ezZ89y9uxZAGrUqEG1atVMTiRyq5UHLzPuh5Pc+3UG147tMDuO6azV530SKAlkuXiL/K169eopWuRCpCUuPoGvd/vhFH6dsq73zI6T62WleI8DpiqlxmMU8tjkM5MNHRT52AsvvADAd999B8Dq1avZsWMHQUFBJCQkpFj2xx9/zPF8Ivf66cQ1LodEEn/kB6rUkLHdGclK8d5u+fsrKfu7leV2WmeWF/lMvXr1kq6PHj2amTNn0rp1a0qXLp1vz/ItMhafoJm704+qxV3ZfnALlTtNMjtSrpeV4t0621KIPGPs2LFJ15ctW8aqVat49tlnTUwkbMFPx6/xd1A47zxZjO3o+84YL+6XlXNY7s7OICLvSUhISNESFyItsfEJzNh2jkdKFaZk7HUg5RnjRdqydDIGpVQdpdRspdQWpVQpy7RuSqnHsieesDXPPPNM0vkphw4dyooVK0xOJHK7NUeucDkkklHtq3HxgnEoWCneGct0y1sp1R74EeOAVG2AxEPBVQb6A92sHU7YnqZNmyZdDw0NZeXKlWzbto26devi6JjyxEuzZs3K6Xgil4mOjWfWjr+pX96dNjWKs362H+7u7nh4eJgdLdfLSp/3h8BIrfVcpdTdZNN3AW9bNZWwWaNGjUq6fubMmaRuk7/++ivFcrLxUgCsOHCJG2HRzOj5KEopzp8/T+XKleX9kQlZKd61gc1pTA8B5GtS3Gfnzp1mRxC5WHhMHF/t8uOJKl40q+wFGIeCbdAgzX1SRCpZ6fMOwTibTmr1gSvWiSNsXZcuXejSpYvZMYQNWLL3IsER9xj1lLEDV2xsLJcuXZL+7kzKSst7JfCZUup5jHHdDkqplsA0YEl2hBO258KFC/Tp0wcgwyIuO+nkX3ciY5n/+wXa1SxBvXLuAFy+fJm4uDgp3pmUleI9HlgKXMLYMecMRsv9W+BjqycTNqlhw4aMGDECAA8PD+m7FGmat8eP8Jg43m7/z3FuEk86LGO8Mycr47xjgV5KqQkYXSV2wB9a67+zK5ywPUuW/PMjbO7cuTg7OycdUVAIgKC70SzZ50+XR0tTo2ThpOnnz58HZJhgZj2weCulFmdw/w6JLSut9UBrhRK26+mnnwZg06ZNFClShOPHj1OzZk2TU4ncZO5OP+7FJ/BW25RHl/Tz86NAgQKUKiXH8c6MjFrexVLdbgEkYByYCowRKHbAHivnEjaqc+fOANjb21OhQgXu3ZOjw4l/XAqOYOXByzzfoCzeXq4p5vn5+VG5cmXs7LK072C+9cDirbXunHhdKfUuEAUM0FpHWKa5Aov4p5iLfC6xvxtgwoQJjB07lhUrVuDl5WViKpEbaK0Zv+EUTg52vPGf+4/pnjjGW2ROVjZYvg78J7FwA2itI5RSHwI7kI2WIpVp06Zx8eJFypQpQ9myZXF1TdnSOnHihEnJhBk2+l7j979v8UHXWpQsUiBp+sGDB5k0aRKnT59OOrSCyFhWircbUBpjlElypQAXqyUSNq1t27YAbN++XY4mKJLcjrjHB5vOUK+cO70aVwDg0KFD/O9//2PLli14enoyefJkXn/9dZOT2o6sFO91wBKl1GjggGVaE2AKsN7awYRt6tmzZ9L1iRMnmphE5CafbP6TsKhYPu1Rhz+OHWXixIls3rwZT09PPv30U1555RUKFSpkdkybkpXiPRyYjjHWO/EIQ3EYfd6j0rmPyGeGDBmS4nZ0dDSbNm3Cz8+PYcOG4e7ujp+fH0WLFpWDD+UTPn63WHP0CsNbVcYu7DpNmzalcOHCUrT/payM844CRlha3olbFfyS94ELkdz58+dp27Yt4eHhhIaG8txzz+Hu7s5XX31FaGgoCxcuNDuiyGbRsfG898Mpynu48MZ/qvLZ5E+Ij4/n+PHjlC1b1ux4Ni3LY3K01hFa6xOWixRukUKrVq1o1aoVAG+++Sbt27cnMDCQggULJi3TpUsXOWhVPjF353ku3org4+61KeBoz7p162jWrJkUbivISreJEBnq379/0nUfHx8OHDhw3x6W5cuX59q1azmcTOS0vwPv8tVuP7rVK82TVYvh5+fH8ePHmTFjhtnR8gQp3sKqkhdvMI4Ul9rly5cpUqRIDiUSZkhI0Ly7/iSuzg6M/6+xh+26desA6NGjh5nR8gzZlUlYVWxsbFLBbt++fYpWllKKsLAwJk6cSKdOncyKKHLAor0XOXLpNuM6PoKXmzNgFO8GDRpQoUIFk9PlDVK8hVW1a9eOdu3aATBjxgz27t1L9erViY6OpmfPnnh7e3Pjxg0mT55sclKRXXz8bjF56190qFWS5x43+rYDAgI4dOiQ7IRjRdJtIqxq8ODBSddLly6Nr68vq1at4tixYyQkJDB06FB69eqVYgOmsF0JCQlEREQkDfe7FhrFqyv/wNvThWnPP5p0SOD1641dQaR4W4/SWmf7Sho0aKCPHDmS7esRucuePXto1qwZDg4p2whxcXH4+PjQokULk5IJa4iJiaFr164cO3aMM2fO4FakKM/P28+FmxFsfLU5lYu5JS3bokULQkND5ZAIWaSUOqq1TvO8cNJtIqwqMjKSyMhIAFq3bk1ISMh9y9y5c4fWrVvndDRhRfHx8fTu3ZtffvmF4OBgJn3wAe9vPMWJK3eY8fyjKQr3jRs32Lt3r7S6rUy6TYRVdezYEYBdu3ahtU7zTDrBwcH3HaRK2A6tNS+//DJr165lxowZnD17lmX7/CjqcoXX2lShfa2SKZbfsGEDWmsp3lYmxVtY1fDhw/nss8/o0qULSil69+6Ns7Nz0vz4+HhOnTpFs2bNTEwp/o2xY8eycOFCxo8fz1tvvcX2P/zYUugUhcIDeLNtx/uWX7duHdWqVaNWrVompM27pNtEWFXPnj2pU6cOnp6eaK0pWrQonp6eSZeyZcvy8ssvs2LFCrOjiocwefJkpk6dyogRI/jggw8IuhvNe1suUtghnjOLRrP395TnZQkODmbnzp0888wzcj5TK5OWt7CqO3fuMHPmTIoUKYK3tzejRo2SLpI8Yt68ebz77ru89NJLfPnll4THxDH4myPciYpl5ZAn6LrCnbfffptDhw4lnQ1n48aNxMfHS5dJNpCWt7Cqrl270rVrV8A4k07yIYE3btxg4cKF+Pj4mBVPPKS1a9cyfPhwOnXqxNKlS7kXrxmy7AhnroUxt1d96lcqzieffMLRo0dZuXJl0v3WrVuHt7c39evXNzF9HqW1zvbL448/rkX+sG7dOr1u3TqttdYdOnTQM2fO1FprfffuXV2mTBnt7u6uHRwc9DfffGNmTJEFsbGxunTp0rphw4Y6IiJCx8bF60FLD2vvsZv0hj+uJC0XHx+vH3/8cV2uXDkdGRmpQ0NDtaOjox45cqSJ6W0bcESnU1el5S2sqkePHknHrjhy5Aht2rQBjJ00ChcuTFBQEAsWLGDatGlmxhRZsHnzZq5du8Z7771HgQIFGbPuBNv/DGRSl1p0rVcmaTk7OzumT59OQEAAn3/+OZs2bSI2Nla6TLKJFG9hVbdu3eLWrVsAhIeH4+7uDsCvv/5K9+7dcXR0pE2bNvj5+ZkZU2TB/PnzKVWqFB07duSDTWdYf+wqI9tVo29T7/uWbdmyJV27duXTTz9l/vz5lC5dmiZNmuR86HxAirewqmeffTbp3JXly5dn3759RERE8MsvvyQd8yQkJAQXFzntqS24fPkyW7ZsYdCgQczd7c9SH38GNq/Ia22qpHufqVOnEh0dzZ49e+jevXvSxkthXTLaRFjV22+/nXR95MiR9OnTBzc3NypUqJC0O/yePXuoU6eOWRFFFixevBitNe6NuvH59nM8U78s4zs98sBhf9WqVWP48OF8+eWX0mWSjeTYJiJbHTlyhICAANq1a4ebm7HL9M8//4y7uzvNmzc3OZ14kLi4OCpWrEiZlj25UbY17WuWYG6v+jjYZ9ySjoiI4Oeff+a5556T8d3/woOObSLFW1jVjRs3AChZsmQGS4rcbtOmTbw4fjaeHV7jPzWK81Xvx3FykC6QnPSg4i3dJsKqXnjhBcA4tklGp7saOXJkTkQSD2ny97vx7PAaLat6Mbd3fSncuYwUb2FVY8eOTbr+5ZdfppgXGxvL9evXKViwIMWLF5finYst2HaCgFItKM1t5vXtgLODfcZ3EjlKirewqg4dOiRdv3jx4n3zAwMDGTBgAEOGDMnJWCIL1h+7wsc7LhPtf5wlH71IAUcp3LmR/A4SVhUQEEBAQEC680uUKMHHH3/MmDFjcjCVyKyNvlcZteY4+sZZHg07QI2qlc2OJNIhLW9hVX369AGMPu/0JCQkEBgYmEOJRGZt9L3KW6t9qVwYdkx7j89Xr8z4TsI0UryFVY0fPz7peuJ5CxNprbl+/Tpz5szhySefzOlo4gESC3ejih5EbJlOcU93OnfubHYs8QBSvIVVtW3bNul64p6WiZRSFCtWjDZt2jB9+vScjibSseGPq4z83ijcH7UvQ40RGxgzZgyOjo5mRxMPIMVbWNWFCxcAqFSpEgkJCSanERlJLNyNK3qyqH8DZkydTHx8PIMHDzY7msiAFG9hVQMHDgQe3Octcocf/rjC298fTyrcF879xZw5c2jXrh2VKlUyO57IgBRvYVWJ5ynMaAcdkJ10zJS8cC/u35Aff1jLwIEDKVy4MFOmTDE7nsgEKd7CqjZv3pyp5ZRSUryzWUJCAseOHePRRx9N0X+dvHDP7/0Y498dw4wZM2jevDlr1qyhVKlSJqYWmSXFW1jV1q1bAahevbrJScSsWbN466238PLy4oUXXqB3795cdSzD22uO07SSJ5M7VqBLpw7s2rWL1157jWnTpuHk5GR2bJFJUryFVQ0bNgyQPm+z3bt3j2nTpvHYY49RpUoVFixYwJKdp/Hq9Bal7MPp7hlHs8bPEhwczLJly5LG5wvbIXtYCqv65JNP6Ny5M97e3oSFhd03/86dO3h7e7Nt2zYT0uUfK1eu5OrVq3z88cd8//33LNx+Eq//jqRgWACHpg3k+We64ejoiI+PjxRuGyWHhBVW16lTJzp27Mgrr7yS5vyvvvqKTZs28fPPP+dwsvwhISGB2rVr4+joiK+vL+uPXWXU2uM0q+zJwr4NCQ66zu+//85TTz2Fh4eH2XHFAzzokLDS8hZWderUKY4ePZpiZ53U2rRpw/Hjx3MwVf6yadMm/vzzT8aMGZNUuJtX9mJh34YUdLKnbNmyvPjii1K4bZwUb2FVr776KkFBQQ88b6FSiuDg4BxMlb9MmTKFChUq4FTtyaTCvaBvAwo6ydEB8xIp3sKqPvvsM8qUKcOJEyfSXebEiROUKVMmB1PlH3v37sXHx4eOwycyZv1JmlX2lMKdR0nxFlbVsGFDevTowYQJE4iKirpvfmRkJO+//z6dOnUyIV3eN2XKFEo0fJqtocVpXNEjqatE5D0yVFBYla+vL127dmXt2rVUq1aNV199lRo1agDw559/Mnv2bLTWjBs3zuSkec+pU6f47e/bFO82lgYVPFjcXwp3XibFW1jVm2++CYCPjw/Dhw9n3LhxJI5oUkrx1FNPMWfOHEqUKGFmzDzpnVkr8eo8mrplCrNkQENcnOTjnZfJf1dY1cyZMwGoUKECmzdv5vbt25w/fx6tNVWrVqVo0aImJ8ybVu4+xakijfFU4Xw7tBOuzvLRzuvkPyysql69eiluFy1alIYNG5qUJn/Y+VcQ722+SGzQBVaM74abFO58QTZYCqs6fPgwhw8fNjtGvrH371sMW36Eezcv0opT1Kxa0exIIofIV7SwqtGjRwNybJOccOhiCIO+OYxdZDA3Vo1n3GEfsyOJHCTFW1jV7NmzzY6QLxy7dJveC3yIDrlO0KpxfDppPLVr1zY7lshBUryFVUkByX47jp1jyKqTxISFUPrP1Wzet1Ne93xIirewKh8f46d7s2bNTE6S9yQkJPDBFwtZ7F8IHRvDsGoxvDf3F+ztZSx3fiTFW1hV4s430udtXaGhoXTtPYQL3l1xcrLjmwFNeaKenPAiP5PiLaxq3rx5ZkfIcy5evMjTz/UlovFgChUuzMY3WlO1RCGzYwmTSfEWViWnP7OuAwcO0PWlATg/PZYiHl6sGfGkFG4ByDhvYWW7d+9m9+7dZsfIE9auXct/Oj9LwY5jKexZnO9efoJHShU2O5bIJaTlLaxq4sSJgPR5/xtaa6ZNm8a7H0yhwsDPKVCkGCuHNqF2mSJmRxO5iBRvYVWLFy82O4LNe+ONN5izeDlVh87B3s2D5YMbU7esu9mxRC4jxVtYVaVKlcyOYNOOHj3KnEXLeGTE1yQUdGfpwEbULy8H8xL3kz7vZOLi4tiwYQM3b940O4rN2r59O9u3bzc7hs36cNoXlH5pMnHORVjcvyENveU8kyJtUrwt/P39adWqFd27d6dixYqMHTuWW7dumR3L5nz00Ud89NFHZsewSb//cYbDhVvg7Fmaxf0b0qSSp9mRRC4mxRtYtWoVjz76KCdPnmT27Nl06dKFqVOn4u3tzbvvvisny82C5cuXs3z5crNj2Jy/A+8yeNUZ7Au68XXPmjSr4mV2JJHL5evifffuXfr168dLL71ErVq18PX15ZVXXmHlypWcPn2azp07M2XKFLy9vXnvvfcICwszO3KuV65cOcqVK2d2DJtyPCCUZ77aR2RUNE/EHKHtY1XMjiRsQL4t3gcPHqRevXqsWLGC999/nz179lCx4j/HQn7kkUdYtWoVp06dolOnTnz66af07t3bxMS2YevWrWzdutXsGDbDx+8WLy04QHx0BIHfjmHSyGFmRxI2It8V7/j4eD7++GOaN29OXFwcu3btYtKkSTg4pD3wpmbNmnz33XdMmTKFn376iZ9++imHE9uWyZMnM3nyZLNj2IRfT9+g/5LDlCzszLVlb9P1P82pWrWq2bGEjVCJJ4fNTg0aNNBHjhzJ9vVkJCAggN69e7Nnzx5eeOEFvvrqK9zdMzd+NjY2lnr16hEZGcnp06dxcXHJ5rS26caNGwCULFnS5CS5l9aaBb9fYMrWs9QuU4QG4Qd5f+zbHD58mAYNGpgdT+QiSqmjWus03xT5Zpz3mjVrGDp0KHFxcXzzzTf06dMHpRQAMXHxXL0dxZXbUdwIi8bLzYnyHi6ULepCAUfjcJuOjo7MnTuXVq1a8cknn8iIinRI0X6wsOhYRq85zi+nA+lYpyQfdq5BnRrP0aZNGyncIkvyfPEODw/njTfeYPHixTRs2JDFy1ZwIcaNt78/zqWQSK7cjiQwLCbd+5co7EwFD1fKebjQqGJFevYdyGeffUbfvn2pVq1aDj4T25DYrdS5c2eTk+Q+f14PY/iKo1y5HcWE/9ZkYHNvFi9ezPXr11m6dKnZ8YSNydPdJj4+PvTv35/z5/0YMPYTijzanl/PBBEeE4eXmzNVirtSrqjRwi5btCDlPFwoUdiZW+H3CAiJ5HLiJTiSC7ciuBUeg72C6ICTlIkP5Of5n1KsUIEcf165WatWrQA5tklq649dYdwPJylcwJE5verT0NuDhIQEatasiYuLC0ePHk36JShEonzXbRIQEMA777zDms2/UerJ56j30tPsiNa4nQmiY52SdHusDE0qemJnl/aHpYKnK49XSLlLstaa09fC2HzyOiv3xhIUV4dGH2+nSSUvutYrTZd6pXFxypMvZ5asXbvW7Ai5SnRsPB9uOsO3By/TpJIHX75Yn2KFnAHYuHEjZ8+eZdWqVVK4RZblqZZ3ZGQkU6dOZfq8b3Bp0AOXWq2xt7OjZbVidHusDO1qlkjqw/434uLiqN+mC7fdKlC51XP4h0RRqIADzz5elt5NKlC5mBs3btwgNDSUGjVqWOGZCVujteaX0zf4cNOfXA2N4uWWlRnVvhoO9sYAr/j4eJo3b05QUBDnzp1Ld7STyN/yfMtba83KlSsZ+9E0oiq2xLPvLJwc7OnVpAIvt6xMicLW7dpwcHBgwWcTadq0Kc/XdOWzV8exfP8llu+/xJJ9/rjcvczlHd8Sc+Ewa75fTbdu3ay6/txs/fr1APTo0cPkJOY5F3iXST+dZt/5YGqULMSqIU1oWtnY1f327dssXryY2bNn4+/vz9dffy2FWzwUm3/XxMbG8t+e/TgaXQy3zpMo6mBP76be2VK0k2vcuDGDBw/mi5kzqVG9Ord27uT6LztxqPYkNOiMV7d3sYsJY+DnPxASAwN75o8CPmvWLCB/Fu87UbF8sf1vvtnvj5uzAx90rcVLjcrjYG/HmTNn+PLLL1m2bBmRkZG0aNGCadOm5cvXSViHTXebXA+N4tmJC7niZHxA+jSryPBWVbK1aCcXHBxM9erVCQ4OxsPDg+eff57evXvTuElTdp29yTf7/Pj9fAgaTa2i8GbnhrSpUTzpp3NedOfOHQCKFMk/Jw64ExXLdwf9mbPzPHdjEmjkGUvDAjeIvnOLkJAQjh8/zm+//YazszO9evXitddeo169embHFjbgQd0mNlm8QyLu8dWu8yz+3Y+4+HiqOwSz/N1eOVa0kzt27BjXrl2jffv2ODk53Tf/r8uBPDNmBmFetbB386REYWeeb1COrvVKU6W4nIvQlvndDOcbH3/WHA4gKi6B6EsnCPltIbFBF5KWcXNzo3Tp0vTr148hQ4ZQrFgxExMLW5NnindYdCwLf7/Iot8vEHkvjrsnd9DKK5I1S7/O1VvrQ0NDaffUU5wNc+SJ/mP5M1ShNdQoWYjOj5bmv3VLUcHT1eyYVrF69WoAevbsaXKS7JGQoNnz902W7PNn97mbOCiIOvs7Uce3MGXsq9SoUQMPDw88PT0pWrQozs7OZkcWNszmi3fQ3WiW7vNnxYFLhEXH0aCEPZunjKDxIxXYunWrTXxAQkNDad++Pb6+viz8dg1xpery88nrHL10G4C6ZYvw37ql+M8jJajk5Zqrv4weJC+M8z537hw9e/akZMmSvP7667Rr157jV8PYeuo6m0/e4GpoFMUKOeMde5kN00dRvUJp1q9fLzttCauz2eLtdzOcBXsusP7YVWITEniqZkme9rajf5c2lCpVin379lG0qO2cIiqxgB8+fBh7e3s8PDwoWqYSzlWbEVuqDlEFiwNQspAjrR8pRctqXjSr4kXhAo4mJ8+8yMhIAJs99suePXvo3r07dvb2OJV+hCivGhSu1QIKuuNgp2hRrRjtq3vw/fR3+WHdGp5//nkWLVqEm5ub2dFFHmRTxVtrzZFLt5m/5wLbzgTi7GDHs4+XZfCTlSgYd5cmTZoQHR3NgQMH8Pb2zt7g2eDOnTssW7aMGzduEBISQkhICMHBwYSEhHA9LIbwQhUoWLE+LpUeA4cC2CmoX96dppW9eLxCUWoUcyYu8m7SfStVqkT58uXNflo2T2vNF4tX8b+vVuH5SFPcKtfnTnQ8Dkpjf/MsV/dvwj7wDH1ffJ4dO3Zw7tw5pkyZwsiRI232V5LI/WyieJ8PustG32ts/OMql29H4aziqKKv4hVymvBgo9CdPXuWO3fusHv37jx5EB+tNadPn2bLli38vGUrhy/cwql8XVyrNMTeyxtlZ4/WCcTeCiDm6p/EXP2ThJsXGNH3WSa8916uGOGxYsUKgFx/7PO4+ATO3wznRMAd9l8I5pc/LhKJscG5uJsTT1QrRuvqxWldozhuzg4cOnSIL7/8ktWrV+Pu7s7q1atp3bq1yc9C5HW5tnhfDY3ip+PX+NH3Gmeuh6HQxF/7g3AfzwAAC/dJREFUkxDfbUT+tQdHpfH09MTDwyNpI9Drr7+ebz40YWFhbN++nW3bthF5Lx7l6U2ESwlClDvX7jkTFW+0+BJiY+DOdR6rWIzOLR6nVml3qpcsRFEXxxxvFebGPu+70bFcvBXB6WthnLp6h1PXwvjrehgxcQkAOMZHE3ruEI+VdmX2+69RtaR7uq9bSEgIzs7OuLrmjQ3MInfLFcX78OHDXLkdxWH/EA773+awfwjng8IBKOkYzZV9G7hx6GdaNKrHhAkTaNSoES4uLvKTNB0JCZrzN8M5eeUOu4//zS8HTxPpVBR713+OT+7m7EA5DxfKexSkvIcL5TyMA3AVdXFKuhQq4JDuMV4Arl27hqOjI15eXhn+L7TWXL9+ncDAQBwdrdNPX6FCBQoVun9IZVx8Anej4wiLjiUsKvFvLLfCYwi4HUVASCQBtyMJCIniTlRs0v0K2GuKO97DPSGMAtE38Tu8i2O7tvDBB5MYP368vN9ErmJ68S5RqaauNORLrt+JBqBQAQdqlyiIfYg/v30znet/n6Bly5b873//S2q5iazRWrN69fe8M/EjgmKdqN6wJQW9ypLgUpRoh0JEaGfi0jhxkr2dokhBR9xdHCngYI+DHURF3CU0JITgm4GEh90BNI6OTri6ueHi6kpBF1dcXFxJACIio4iMjiE6+h7RsbEkaIWyswNlXJJf50GFMel9qC3XNWjAzg5HR2ccnJywc3BE2dkTryE2Pv33rR0JOMdFkBAWRETQZe5e9yfuzg3uBfoRFxpoPDZQoEABSpQowccff0yvXr0e7oUXIhuZXrxdy1TTnUbPwunOZW79eZCTe3/h2tWrALRo0YJJkyZJ0baSqKgoZsyYwYYNG5I2hBp7PSrsXd2xL1Ic+wKFcHBzp1gZb4qWLIebRwnsCxbiZshtboWEEq8V9o6OFHL3oFARdxRwLyaGmOhooqOjiImJRmsNCQmQEE8BZ0dcCxbAzdWF2HsxODs6ULJEcRRGvbYDEhv3aZVvnfqvpW5rrQm7E8rNoCCCblwjOioSnRCPo70dBZ0ciAy9RdSdEBJiwkmIjiAhJoKEqDDiw2//v72zjbHqqsLw85aKoNiPCV9FCgwzY1pbg4PaqUGrUouASbWosS0mUtrEaDBiW2PVxNLUj7TSJkZ/qNFCY2kkFdpKbVVaKRLTTKytIFRhGPmQEpECnTJTikWWP/a+eLjcYebeOWfuOcx6kp1z797r7L3fubPXnLvPmbWYMOECWlpaaG5upqWlhalTpzJmzJgT228NDQ2MHDky2w/DcQZI3Z23pBODNDc309bWRltbGzNmzKC1tdW/qmbMsWPHOHToEAcOHGDfvn10dnbS0dHB9u3bTxx7enqYNGkSc+fOZc6cOcycObPXx9+OHj3Kjh07AGhsbDzpOfus9rzNjM7OTtrb22lvb6erq+uU+yENDQ2MGzeOpqYm35N2zgjy4LwPA1szHyh/jAZeqvck6oDrHloMRd2DpXmymVWMqTBYUQW39vbX40xG0rOue+jguocOedB85oa3cxzHOYNx5+04jlNABst5/2SQxskbrnto4bqHDnXXPCg3LB3HcZx08W0Tx3GcAuLO23Ecp4C483YcxykgNTlvScMkfU/SfkmHJa2SNPo09rMlbZF0RNJmSbPK2pslPSmpR9IeSbfUMq8syUCzSXpVUnei1D+maxnV6Jb0VkmPStoV9Z0SF1bSWEmrY1/7Jd0lKXcXERno3inptbLP+x3ZK6mOKnXPlfR7SS9JOiRpg6T3l9nkfm1DJrozX9+1LprbgI8BbcDEWPfzSoaSpgKrge8C58bjw5KmxPZhwBrgb8AY4Grgq5LylgQxNc0JZpnZqETpymLiA6TfuoHjwO+A64E9vdisiMeJsc9rgK+kMtN0SVs3wE1ln/dfU5ttelSj+3zgB0AzYe0+CDwh6UIo1NqGFHUnyHZ9m1nVBdgF3Jh430SIJTS5gu0dwIayug3A7fH1h4BXgVGJ9juBdbXMLauSpub43oD31VtXmrrLztsJfKasrjGe25SouxHYUW+dWeo+XX3eSq26E/b/AubF14VY22nrju8zX99VX3lLOg+YBPy5VGdmncArwLQKp0xL2kaeS9hOA7aZWXcv7XUnA80lHopfvdolzUtxyqlQg+6+mAZ0xT5KPAdMkXTOQOaaJhnoLnGvpIOS/iLpcwOcZuoMVHfcBhoNlL5R5H5tQya6S2S6vmvZNilFxi//CvAyUGkBvqUP277a80DamgE+TLgSnQjcC6yQNHvgU02VanX3p79KfVFjf1mRtm6AzwJTgXGEbaLv5NCB16xb0lhgFbDUzDoS/eV9bUP6umEQ1nctzvtwPJZvvp9H+EtVyf50tn2154G0NWNmT5nZa7GsBB4A8pYRoFrd/emvUl/JsfJA2roxs/Vm1m1mr5vZWsKCzluiz5p0S5oArCPs+3+trL+8r21IX/egrO+qnbeZvQzsBqaX6uINunOATRVO2Zi0jbTG+lL72yS9uZf2upOB5kocp3KugrpRg+6+2AicG/so0QrstBzdrM1AdyXOiM873oTfADxhZossbvhGcr+2IRPdlUj/865xc/8bhPjcjQSBDwG/6cW2iXDT4jrgDfHYA0yJ7cMId6O/D4wE3gnsA66t902MDDVfClwGDI/tH4/2V9db50B0R/sRsewCboivz060rwV+GftqjH3fVm+dWeoGJhNu3o2Iv+8fAP4NfLHeOgeiG7iI8HTNt3ppL8TazkD3oKzvWoUOA5YSgpEfJjwWNzq2zQe6y+xnA1uAI/E4q6y9GXgqCtwL3FrvDzNLzXEhbyE49EPAs3n8ha5Rt1UoSxLtY2Mfh2OfdwNn1VtnlrqBtwPPx35eATYDi+qtcaC6gWVRZ3dZmZ+wyf3aTlv3YK1vD0zlOI5TQHL3n22O4zhO37jzdhzHKSDuvB3HcQqIO2/HcZwC4s7bcRyngLjzdhzHKSDuvJ1CIOkxScvj66cl/bDOU3KcunJ2vSfgODUwD3i9P4aSlgCfNLNLM52R4wwy7rydwmFmB+s9B8epN75t4uQOSW+StDymjton6etl7Sdtm0iaJ2mTQsq5g5LWSxonaQFwO3BJTEtlsQ5JN8dzeiS9KOmnMa5zqc8FcfwrFdLY9UhaJ6mxbC5zY7zmI5IOSFojaURsG66Q5m1PTIn1J0kfye4n5wwl3Hk7eWQpcBXwCeBKQiS6KyoZShoP/AK4H7g42pXSV60E7iEEHLoglpWx7TiwGLiEkL7sMkJqqyRvJIT6XAi8lxAi9EeJsWcDvyIE23oXIabFev6/rpYRglBdTwhWdD+wRlKukhE4xcRjmzi5QtIo4ACw0MxWJOr2AI+Y2QJJTwObzWyRpOmEDChTzGxXhf6W0I897+iIHwVGmtnxeIW+DLjIzLZGm/nAfcAIMzNJfwT+aWbXVuivCeiI89qdqH8E2GtmX6jqB+M4ZfiVt5M3mgihNJ8pVVhIo9Vbst6NwJPA5pjx+/OSxvQ1iKSZktbGLY1SFLnhwPiE2dGS447sjTbnx/ethIh5lZhOiN/8QjKDOPDRqNFxBoQ7b6fQmNl/gVmxbCIkNO443daEpMnArwmxpj9F2PJYGJuHJ0yPlQ8Xj/1ZN2dF+/cQ4liXysWJsRynZtx5O3mjk/AY4OWlipiJpddtDws8Y2Z3EJzlXuDTsfk/hFjNSd5NcNJfjudtAybUMNfnCXvyvbUJGG9m28vKizWM5Tgn4Y8KOrnCzLol/Qy4S9J+giP+Jqc6YAAkXU5I9vpbQpaWVuBC4IVoshOYHPfGdxMC7XcQLlwWS1pN+EOxuIbpfptwA3I78CDBWc8Cfmxm2yStAJZLuoWQNb0B+CDwDzNbXcN4jnMCv/J28sithMSuD8fjZuAPvdh2ATOAxwhO+R7gTjN7ILavAh4n7E3vB64zs03Al4CbCU7+pjhmVZjZ48A1wBzClfZ6whMnx6PJDYSbnncDf49zvIKQKs1xBoQ/beI4jlNA/MrbcRyngLjzdhzHKSDuvB3HcQqIO2/HcZwC4s7bcRyngLjzdhzHKSDuvB3HcQqIO2/HcZwC8j+LRs7Q678YBwAAAABJRU5ErkJggg==\n",
      "text/plain": [
       "<Figure size 432x288 with 1 Axes>"
      ]
     },
     "metadata": {
      "needs_background": "light"
     },
     "output_type": "display_data"
    }
   ],
   "source": [
    "import numpy as np\n",
    "import matplotlib.pyplot as plt\n",
    "%matplotlib inline\n",
    "\n",
    "def plot(xs, ys, title, xlabel, ylabel):\n",
    "    figure = plt.figure()\n",
    "    plt.plot(xs, ys, color='k')\n",
    "    plt.tick_params(left=False, labelleft=False, labelsize=13)\n",
    "    plt.ylim(0, plt.ylim()[1])\n",
    "    plt.xlim(0, xs.max())\n",
    "    plt.title(title, fontsize=16)\n",
    "    plt.xlabel(xlabel, fontsize=14)\n",
    "    plt.ylabel(ylabel, fontsize=14)\n",
    "    return figure\n",
    "\n",
    "histogram = vamb.vambtools.read_npz('histogram.npz')\n",
    "xs = np.linspace(0, 1, 400)\n",
    "fig_a = plot(xs[1:], histogram, \"Figure A: Histogram of all distances\", 'distance', 'density')\n",
    "hist2 = histogram[0:120:2] + histogram[1:120:2]\n",
    "fig_b = plot(xs[0:110:2], hist2[:55], \"Figure B: Zoom-in on low X values\", 'distance', 'density')\n",
    "densities = vamb.cluster._calc_densities(hist2, cuda=False)\n",
    "plt.plot(xs[0:110:2], densities[:55])\n",
    "threshold, success = vamb.cluster._find_threshold(densities, peak_valley_ratio=0.2, cuda=False)\n",
    "plt.vlines(threshold, 0, plt.ylim()[1], linestyles='dotted')\n",
    "plt.text(threshold, plt.ylim()[1]/5, \"Clustering threshold\", rotation=90, fontsize=14)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "When picking an arbitrary contig as medoid (here contig 'S4C11236' in the CAMI2 toy human Airways short read dataset), and calculating the distances to all other contigs in the dataset, it follows a distribution as seen in Figure A. Unsurprisingly, the latent representation of most other contigs are mostly uncorrelated with the chosen contig, so the large majority of points are at a distance of around 0.5. No contigs have a larger distance than 0.8. \n",
    "\n",
    "But look at the lower left corner of Figure A: A small group of contigs appear to have a smaller distance to the medoid than most others - those of the cluster it belongs to. If the cluster is well-separated, this small initial peak should be mostly isolated from the main peak around 0.5 by a band of empty space, with a low density. Figure B confirms this is true.\n",
    "\n",
    "Vamb groups the observed distances from the medoid M to a histogram similar to that in Figure B, and then iterates over the y-values of the histogram attempting to find first an initial peak, then a valley. At the bottom of the valley, the cutoff distance threshold is set, and all points closer to M than this point is clustered out. This threshold is depicted in Figure B by the dotted line. Vamb smoothes the histogram in order to mitigate the influence of random variation on the minima/maxima of the function. The smoothed hisotgram is visible in Figure B as a blue line.\n",
    "\n",
    "The algorithm for detecting the threshold is as follows:\n",
    "\n",
    "Parameter DEFAULT is 0.09 by default. RATIO is provided to the function and will be described later.\n",
    "\n",
    "    Function `find_threshold` inputs = (medoid, RATIO), outputs = (threshold, was_successful)\n",
    "    1. Calculate the distances from M to all other points. Group to histogram with bin size of 0.005 and smooth using a simple Gaussian kernel.\n",
    "    2. If there are no other points within 0.03, return (0.025, None) end function\n",
    "    3. Set success to FALSE. For X = 0 to X = 0.3:\n",
    "        If the peak has not yet been declared over:\n",
    "            If density is increasing:\n",
    "                Set peak to X, peak density to density\n",
    "                If X > 0.1, there is no initial peak near X=0, end loop\n",
    "            Else if density < 0.6 * peak density\n",
    "                Declare initial peak to be over\n",
    "                Set minimum to density\n",
    "        Else\n",
    "            If density > 1.5 * minimum, we are entering another peak, end loop\n",
    "            If density < minimum\n",
    "                Set minimum to density\n",
    "                if density < RATIO * peak density, we have successfully found a potential threshold\n",
    "                    set success to TRUE\n",
    "                    set threshold to X\n",
    "                    \n",
    "    4. If threshold is set to above 0.14 + 0.1 * RATIO: Initial peak is not close enough to X=0, set success to FALSE\n",
    "    5. If success is FALSE and RATIO > 0.55: We do not accept failure. Set success to None and threshold to DEFAULT\n",
    "    6. Return (threshold, success)\n",
    "    \n",
    "This function is annoyingly convoluted, but I was not able to find a simpler function that did not return nonsensical thresholds for a variety of different data points.\n",
    "\n",
    "You will notice that the parameter RATIO controls how strict the criteria for accepting a threshold as successful is. If RATIO is low, say, 0.1, the initial peak must lie close to zero, and must be separated from the bulk of the other points with a deep valley. In constrast, a larger RATIO makes the criteria more relaxed. With a RATIO > 0.55, the function does not even accept failure, and will set the threshold to DEFAULT if no valley is found."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Clustering\n",
    "\n",
    "With the medoid search function and the threshold detection above, you can see how to begin clustering points. However, one more problem remains: Not all medoids separate the points to a neat small peak of close contigs and a larger one of far contigs as seen in Figure B. In fact, for the majority of medoids M sampled, the density is continuously increasing with distance, or the initial peak is too far away to consider M to be part of the peak, or there is clearly an initial double-peak rather than one. To make things worse, if a medoid M cannot successfully return a threshold, this does not mean M must be discarded - it is quite likely that M is contained in a cluster for which a threshold could be successfully found using another point as medoid.\n",
    "\n",
    "To handle this problem, Vamb takes the approach of beginning with strict requirements for when a threshold is accepted, by setting the RATIO parameter low. If the threshold detection for a medoid returns failure, that medoid is simply skipped in the hope of finding another medoid with better results. Vamb then keeps count of how many of the recent attempts that yielded success - if too few did, RATIO is increased to relax the requirements for clusters.\n",
    "\n",
    "The algorithm is as follows:\n",
    "\n",
    "Parameters WINDOWSIZE and MINSUCCESSES is 200 and 15 by default, respectively. RATIO is initialized as 0.1\n",
    "\n",
    "Function `cluster`\n",
    "    1. Pick a medoid M using function `wander_medoid`\n",
    "    2. Set threshold to None. While threshold is None:\n",
    "           Set (success, threshold) to result of function `find_threshold(M, RATIO)`\n",
    "           If fewer than MINSUCCESSES of last WINDOWSIZE attempts at finding threshold succeded:\n",
    "               Increment RATIO by 0.1\n",
    "               Discard the memory of the success or failure of previous attempts\n",
    "    3. Output all points within threshold of M as a cluster. Remove these points from dataset\n",
    "    4. If no more points remain, end function. Else, go to point 1.\n",
    "    \n",
    "Let's have a look at the actual Python function that does the clustering in Vamb:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Help on function cluster in module vamb.cluster:\n",
      "\n",
      "cluster(matrix, labels=None, maxsteps=25, windowsize=200, minsuccesses=20, destroy=False, normalized=False, cuda=False)\n",
      "    Create iterable of (medoid, {point1, point2 ...}) tuples for each cluster.\n",
      "    \n",
      "    Inputs:\n",
      "        matrix: A (obs x features) Numpy matrix of data type numpy.float32\n",
      "        labels: None or list of labels of points [None = range(len(matrix))]\n",
      "        maxsteps: Stop searching for optimal medoid after N futile attempts [25]\n",
      "        windowsize: Length of window to count successes [200]\n",
      "        minsuccesses: Minimum acceptable number of successes [15]\n",
      "        destroy: Save memory by destroying matrix while clustering [False]\n",
      "        normalized: Matrix is already preprocessed [False]\n",
      "        cuda: Accelerate clustering with GPU [False]\n",
      "    \n",
      "    Output: Generator of (medoid, {point1, point2 ...}) tuples for each cluster.\n",
      "\n"
     ]
    }
   ],
   "source": [
    "# From Vamb 3.0, clustering is done by the ClusterGenerator object.\n",
    "# We can use the old interface to construct a cluster iterator.\n",
    "help(vamb.cluster.cluster)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "---\n",
    "An explanation of a few of the parameters is in order:\n",
    "    \n",
    "`matrix` is the latent encoding.\n",
    "\n",
    "`labels` is a list (or array) of labels for each point. If `None`, the labels will be the row position of the point in the matrix, e.g. the first point will be `0`.\n",
    "\n",
    "`maxsteps`, `default`, `windowsize`, and `minsuccesses` correspond to the parameters described in the algorithms above. It's not realistic to expect users to set these - in other words, they should be OK as they are.\n",
    "\n",
    "The function iteratively deletes the input matrix. To avoid this, the function works on a copy of the matrix. To skip copying the matrix and save memory, set `destroy` to `True`.\n",
    "\n",
    "For clustering, the matrix needs to be preprocessed (as described above in the clustering algorithm). If this has already been done, set `normalized` to `True`. If `destroy` is `True`, the matrix will be normalized in-place.\n",
    "\n",
    "If `cuda` is set to `True`, the clustering will be run on GPU for a significant speedup.\n",
    "\n",
    "---\n",
    "Depending on the size of the latent encoding, clustering can take quite some time. The heavy lifting here is done in PyTorch, so if you're not using a GPU, it might be worth making sure the BLAS library PyTorch is using is fast. The difference between a fast and a slow PyTorch implementation can be quite remarkable. If you use PyTorch v. >= 1.1 You can check it with `torch.__config__.show()` and notice if it says something about using Intel Math Kernel (MKL) library, or OpenBLAS. If it does, you're golden.\n",
    "\n",
    "The generator will compute the clusters on-the-fly, meaning it will only compute the next cluster *once you ask for it*. Having the clustering return a generator gives a lot of flexibility:\n",
    "\n",
    "You can manually iterate over the clusters:\n",
    "    \n",
    "    clusters = dict()\n",
    "    for n, (medoid, cluster) in enumerate(vamb.cluster.cluster(latent)):\n",
    "        clusters[medoid] = cluster\n",
    "        \n",
    "        if n + 1 == 1000: # Stop after 1000 clusters\n",
    "            break\n",
    "\n",
    "You can just put it directly in a dictionary:\n",
    "\n",
    "    clusters = dict(vamb.cluster.cluster(latent))\n",
    "    \n",
    "Or you can use the `vamb.cluster.writeclusters` function to write the clusters to disk without storing them in memory:\n",
    "\n",
    "    cluster_iterator = vamb.cluster.cluster(latent)\n",
    "    with open('clusters.tsv', 'w') as clusterfile:\n",
    "        vamb.cluster.writeclusters(clusterfile, cluster_iterator)\n",
    "        \n",
    "The `vamb.cluster.cluster` function is a thin wrapper which instantiates a `vamb.cluster.ClusterGenerator` object. This object yields `Cluster` objects. These objects contain various data, e.g. the `ClusterGenerator` contains the full state of the clusterer, and its internal parameters, whereas a `Cluster` contains all information about the cluster. If you just want the clusters and don't need access to these internal details, the simpler `vamb.cluster.cluster`, generator, yielding tuples, will be fine.\n",
    "\n",
    "In this example, we will load it into a dictionary immediately:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 37,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "First key: S3C87523 (of type: <class 'str'> )\n",
      "Type of values: <class 'set'>\n",
      "First element of value: S6C1433 of type: <class 'str'>\n"
     ]
    }
   ],
   "source": [
    "# I cheat here and load in a latent representation that has trained for 500\n",
    "# epochs instead of the shorter training we did above.\n",
    "#latent = vamb.vambtools.read_npz('/Users/jakobnissen/Downloads/example/out/latent.npz')\n",
    "\n",
    "# Notice we mask the contignames, since the dataloader could have filtered some contigs away\n",
    "filtered_labels = [n for (n,m) in zip(contignames, mask) if m]\n",
    "cluster_iterator = vamb.cluster.cluster(latent, labels=filtered_labels)\n",
    "clusters = dict(cluster_iterator)\n",
    "\n",
    "medoid, contigs = next(iter(clusters.items()))\n",
    "print('First key:', medoid, '(of type:', type(medoid), ')')\n",
    "print('Type of values:', type(contigs))\n",
    "print('First element of value:', next(iter(contigs)), 'of type:', type(next(iter(contigs))))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<a id=\"postprocessing\"></a>\n",
    "## Postprocessing the clusters\n",
    "\n",
    "We haven't written any dedicated postprocessing modules because how to postprocess really depends on what you're looking for in your data.\n",
    "\n",
    "One of the greatest weaknesses of Vamb - probably of metagenomic binners in general - is that the bins tend to be highly fragmented. Unlike other binners, Vamb does not hide this fact, and will bin *all* input contigs. This means you'll have lots of tiny bins, some of which are legitimate (viruses, plasmids), but most are parts of larger genomes that didn't get binned properly - about 2/3 of the bins here, for example, are 1-contig bins. \n",
    "\n",
    "We're in the process of developing a tool for annotating, cleaning and merging bins based on phylogenetic analysis of the genes in the bins. That would be extremely helpful, but for now, we'll have to use more crude approaches.\n",
    "\n",
    "First, we will split up all bins by their sample of origin using `vamb.vambtools.binsplit`. This will cause some bins to be fragmented, but for bins of high per-sample coverage, it will deduplicate them effectively. Second, we'll throw away all bins with less than 200,000 basepairs because we're only interested in genome-sized bins.\n",
    "\n",
    "---"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 38,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Number of bins before splitting and filtering: 512\n",
      "Number of bins after splitting and filtering: 192\n"
     ]
    }
   ],
   "source": [
    "def filterclusters(clusters, lengthof):\n",
    "    filtered_bins = dict()\n",
    "    for medoid, contigs in clusters.items():\n",
    "        binsize = sum(lengthof[contig] for contig in contigs)\n",
    "    \n",
    "        if binsize >= 200000:\n",
    "            filtered_bins[medoid] = contigs\n",
    "    \n",
    "    return filtered_bins\n",
    "        \n",
    "lengthof = dict(zip(contignames, lengths))\n",
    "filtered_bins = filterclusters(vamb.vambtools.binsplit(clusters, 'C'), lengthof)\n",
    "print('Number of bins before splitting and filtering:', len(clusters))\n",
    "print('Number of bins after splitting and filtering:', len(filtered_bins))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "---\n",
    "Even as we split some bins in 9, we're still left with fewer than half of the starting number of bins! Now, let's save the clusters to disk. For this we will use two writer functions:\n",
    "\n",
    "1) `vamb.cluster.writeclusters`, that writes which clusters contains which contigs to a simple tab-separated file, and\n",
    "\n",
    "2) `vamb.vambtools.writebins`, that writes FASTA files corresponding to each of the bins to a directory. This might be useful for some types of analysis you want to do down the road.\n",
    "\n",
    "We will need to load all the contigs belonging to any bin into memory to use `vamb.vambtools.writebins`. If the contigs in your bins don't fit in memory, sorry, you gotta find another way to make those FASTA bins.\n",
    "\n",
    "The cluster name when printing either way will be the dictionary key of the bins.\n",
    "\n",
    "---"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 40,
   "metadata": {},
   "outputs": [],
   "source": [
    "# This writes a .tsv file with the clusters and corresponding sequences\n",
    "with open('/Users/jakobnissen/Downloads/example/clusters.tsv', 'w') as file:\n",
    "    vamb.vambtools.write_clusters(file, filtered_bins)\n",
    "\n",
    "# Only keep contigs in any filtered bin in memory\n",
    "keptcontigs = set.union(*filtered_bins.values())\n",
    "\n",
    "with open('/Users/jakobnissen/Downloads/example/contigs.fna', 'rb') as file:\n",
    "    fastadict = vamb.vambtools.loadfasta(file, keep=keptcontigs)\n",
    "    \n",
    "bindir = '/Users/jakobnissen/Downloads/example/bins'\n",
    "vamb.vambtools.write_bins(bindir, filtered_bins, fastadict, maxbins=500)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<a id=\"summary\"></a>\n",
    "## Summary of full workflow\n",
    "\n",
    "This is the full default workflow from beginning to end. Calling Vamb from command line does essentially this, except with some input validation, logging, and saving intermediate results to files."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import sys\n",
    "import os\n",
    "sys.path.append('/Users/jakni/Documents/scripts/vamb')\n",
    "import vamb\n",
    "\n",
    "# Calculate TNF\n",
    "with open('/Users/jakni/Downloads/example/contigs.fna', 'rb') as contigfile:\n",
    "    tnfs, contignames, contiglengths = vamb.parsecontigs.read_contigs(contigfile)\n",
    "\n",
    "# Calculate RPKM\n",
    "bamdir = '/Users/jakni/Downloads/example/bamfiles/'\n",
    "bampaths = [bamdir + filename for filename in os.listdir(bamdir) if filename.endswith('.bam')]\n",
    "rpkms = vamb.parsebam.read_bamfiles(bampaths)\n",
    "\n",
    "# Encode\n",
    "vae = vamb.encode.VAE(nsamples=rpkms.shape[1])\n",
    "dataloader, mask = vamb.encode.make_dataloader(rpkms, tnfs)\n",
    "vae.trainmodel(dataloader)\n",
    "latent = vae.encode(dataloader)\n",
    "\n",
    "# Cluster and output clusters\n",
    "filtered_labels = [n for (n,m) in zip(contignames, mask) if m]\n",
    "cluster_iterator = vamb.cluster.cluster(latent, labels=filtered_labels)\n",
    "with open('/Users/jakni/Downloads/example/bins.tsv', 'w') as binfile:\n",
    "    vamb.cluster.write_clusters(binfile, cluster_iterator)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<a id=\"memory\"></a>\n",
    "## Running VAMB with low memory (RAM)\n",
    "\n",
    "In the VAMB software, a series of tradeoffs can be taken to decrease RAM consumption, usually to the detriment of some other property like speed or convenience (though not accuracy). With all of those tradeoffs taken, VAMB is relatively memory efficient. By default, several of these tradeoffs are are not taken when running from a Python interpreter, however they are all enabled when running VAMB from command line.\n",
    "\n",
    "The memory consumption of the encoding step is usually the bottleneck. With all memory-saving options enabled, this uses approximately $4 * N_{contigs}*(103+N_{latent}+N_{samples})$ bytes, plus some hundreds of megabytes of overhead. As a rule of thumb, if you don't have at least two times that amount of memory, you might want to enable some or all of these memory optimizations.\n",
    "\n",
    "Here's a short list of all the available memory saving options:\n",
    "\n",
    "- Pass a path to the `dumpdirectory` in the `vamb.parsecontigs.read_bamfiles` function. Temporary files will be written to files in this directory to avoid keeping them in memory. This takes a little bit of disk space, but saves lots of RAM\n",
    "- In the `encode.make_dataloader` function, set `destroy` to True. This will in-place modify your RPKM and TNF array, deleting unusable rows and normalizing them. This prevents the creation of a copy of the data.\n",
    "- Similarly, set `destroy=True` when using the `cluster.cluster` function. Again, the input (latent) datasets are modified in-place, saving a copy of the data. This function will completely consume the input array.\n",
    "- When clustering, instead of generating clusters in memory, instantiate the cluster iterator and pass it directly to `cluster.write_clusters`. This way, each cluster will be written to disk after creation and not accumulate in memory. This obviously requires disk space, and you will have to reload the clusters if you are planning on using them."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<a id=\"benchmark\"></a>\n",
    "## Optional: Benchmarking your bins\n",
    "\n",
    "If you want to tweak or enchance Vamb, you'll want to know how well it performs. For this to make any sense, you need to have a *reference*, that is, a list of bins that are deemed true and complete. Otherwise, what do you benchmark against?\n",
    "\n",
    "Vamb's benchmarking works in the following way: You count the number of genomes for which *any* bin has a competeness (recall) above a certain level and a contamination below a certain level (i.e. precision above a certain level). Recall and precision are calculated by the number of basepairs with >= 1x coverage in the right genome versus wrong genomes. In other words, for a bin B and a genome G, the recall is (# basepairs of G covered by a contig in B) / (# basepairs of G covered by any contig), and precision is (# basepairs of G covered by a contig in B) / (# basepairs of any genome covered by a contig in B). An bin *can* count towards multiple genomes given a low enough precision threshold.\n",
    "\n",
    "Let's go through how to benchmark your bins.\n",
    "\n",
    "---\n",
    "#### Genome reference file\n",
    "First, you need a proper reference. For each contig (or sequence) you are binning, you need to provide:\n",
    "* The name of the contig\n",
    "* The name of the genome the contig is coming from (genome names may be arbitrarily chosen)\n",
    "* The name of the reference contig from that genome (may also be arbitrarily chosen)\n",
    "* The leftmost position (start) of where the contig aligns to the reference contig\n",
    "* The rightmost position (end) of where the contig aligns to the reference contig.\n",
    "\n",
    "This information is given as a tab-separated file, with one line per contig. For example, suppose that I have a contig `test_contig_45` which I have located to belong to position 113501 to position 132884 in the reference contig `NC_013740.1` of the genome of `Acidaminococcus fermentans`. Then, one of the lines in the reference file will be:\n",
    "\n",
    "    test_contig_45     Acidaminococcus fermentans       NC_013740.1    113501     132884\n",
    "\n",
    "This file can then be loaded into a `Reference` object:\n",
    "    \n",
    "    with open('/path/to/ref_file.tsv') as ref_file:\n",
    "        reference = Reference.from_file(ref_file)\n",
    "        \n",
    "#### Reference taxonomy file\n",
    "One can choose to benchmark at different taxonomic levels: A bin that is completely uncontaminated on the genus level may not be pure species-wise. Vamb allows you to benchmark at arbitrary taxonomic levels.\n",
    "\n",
    "The reference file described above is taken to represent the lowest (i.e. most specific) taxonomic level. In the taxonomy file, each genome of the reference file is then assigned to one higher rank. Different genomes may be assigned to the same higher rank, but each genome can only be assigned to one. These higher ranks may then be assigned yet higher ranks, and so on. For example's sake, let us assume the lowest \"Genome\" level in the genome reference file represents strains, and the higher ranks are species, genera and families, in order.\n",
    "\n",
    "In the reference file, this is written as a tab-separated text file with one line per genome, and one column per taxonomic level. Note that the leftmost column is taken to be genome names given in the genome reference file. So, for the example line above of the genome reference file, if a reference taxonomy file is loaded, there *must* be a line where the leftmost column is `Acidaminococcus fermentans`. If the true species/genus/whatever of a genome is unknown, you can assign it to an arbitrarily named species/genus. For example, a reference taxnomy file may begin with the following four lines - note that `Strain_118` is assigned an arbitrary species, genus and family.\n",
    "\n",
    "    Strain_455         Bacillus subtilis         Bacillus           Bacillaceae\n",
    "    Strain_255         Bacillus licheniformis    Bacillus           Bacillaceae\n",
    "    Strain_1067        Clostridium tetani        Clostridium        Clostridiaceae\n",
    "    Strain_118         Strain_118                Strain_118         Strain_118\n",
    "    \n",
    "The definitions given above for *recall* and *precision* only makes sense given known mapping locations for contigs, which are specified in the genome reference file. However, the mapping location of a contig to e.g. a genus is a nonsensical concept. So, how is recall and preicison defined for higher taxonomic levels defined?\n",
    "\n",
    "The recall of a higher taxonomic level is the maximum of all of its members. The precision of a higher taxonomic level is the sum of all of its members.\n",
    "\n",
    "A reference taxonomy file can be added to a `Reference` object like so:\n",
    "\n",
    "    with open('/path/to/tax_file.tsv') as tax_file:\n",
    "        reference.load_tax_file(tax_file)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Then you need to load your observed bins. The format of this file is expected to be the same as the output of Vamb, namely a two-column file, with arbitrary bin names in the first column and contig names in the second column. Any contig seen in the observed bins is expected to be present in the reference as well. You also need to provide the reference:\n",
    "\n",
    "    with open('/path/to/clusters.tsv') as clusters_file:\n",
    "        bins = Binning.from_file(clusters_file, reference)\n",
    "        \n",
    "You can the query the bins for information about the quality of your bins. Below I'll show an example of this in practice: I will use Vamb's (default) performance on the `metabat_errorfree` dataset and compare it to that of MetaBAT2."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 43,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "gi|224815735|ref|NZ_ACGB01000001.1|_[Acidaminococcus_D21_uid55871]_1-5871\tAcidaminococcus_D21_uid55871\tNZ_ACGB01000001.1\t1\t5871\r\n",
      "gi|224815735|ref|NZ_ACGB01000001.1|_[Acidaminococcus_D21_uid55871]_5841-8340\tAcidaminococcus_D21_uid55871\tNZ_ACGB01000001.1\t5841\t8340\r\n",
      "gi|224815735|ref|NZ_ACGB01000001.1|_[Acidaminococcus_D21_uid55871]_8310-10809\tAcidaminococcus_D21_uid55871\tNZ_ACGB01000001.1\t8310\t10809\r\n",
      "gi|224815735|ref|NZ_ACGB01000001.1|_[Acidaminococcus_D21_uid55871]_10779-29944\tAcidaminococcus_D21_uid55871\tNZ_ACGB01000001.1\t10779\t29944\r\n",
      "gi|224815735|ref|NZ_ACGB01000001.1|_[Acidaminococcus_D21_uid55871]_29914-33073\tAcidaminococcus_D21_uid55871\tNZ_ACGB01000001.1\t29914\t33073\r\n",
      "gi|224815735|ref|NZ_ACGB01000001.1|_[Acidaminococcus_D21_uid55871]_33043-41174\tAcidaminococcus_D21_uid55871\tNZ_ACGB01000001.1\t33043\t41174\r\n",
      "gi|224815735|ref|NZ_ACGB01000001.1|_[Acidaminococcus_D21_uid55871]_41144-44994\tAcidaminococcus_D21_uid55871\tNZ_ACGB01000001.1\t41144\t44994\r\n",
      "gi|224815735|ref|NZ_ACGB01000001.1|_[Acidaminococcus_D21_uid55871]_44964-53996\tAcidaminococcus_D21_uid55871\tNZ_ACGB01000001.1\t44964\t53996\r\n",
      "gi|224815735|ref|NZ_ACGB01000001.1|_[Acidaminococcus_D21_uid55871]_53966-59194\tAcidaminococcus_D21_uid55871\tNZ_ACGB01000001.1\t53966\t59194\r\n",
      "gi|224815735|ref|NZ_ACGB01000001.1|_[Acidaminococcus_D21_uid55871]_59164-65356\tAcidaminococcus_D21_uid55871\tNZ_ACGB01000001.1\t59164\t65356\r\n"
     ]
    }
   ],
   "source": [
    "# First load in the Reference\n",
    "reference_path = '/Users/jakobnissen/Downloads/vambdata/metahit/reference.tsv'\n",
    "\n",
    "!head $reference_path # show first 10 lines of reference file\n",
    "\n",
    "with open(reference_path) as reference_file:\n",
    "    reference = vamb.benchmark.Reference.from_file(reference_file)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "---\n",
    "The first 10 lines wrap, but you can see the information expected to be present.\n",
    "\n",
    "The `reference` object contains a bunch of attributes which keeps track of which contigs belongs to which bins. You can see which ones using good ol' `help`:\n",
    "\n",
    "`>>> help(reference)`\n",
    "\n",
    "    class Reference(builtins.object)\n",
    "     |  Reference(genomes, taxmaps=[])\n",
    "     |  \n",
    "     |  A set of Genomes known to represent the ground truth for binning.\n",
    "     |  Instantiate with any iterable of Genomes.\n",
    "     |  \n",
    "     |  >>> print(my_genomes)\n",
    "     |  [Genome('E. coli'), ncontigs=95, breadth=5012521),\n",
    "     |   Genome('Y. pestis'), ncontigs=5, breadth=46588721)]\n",
    "     |  >>> Reference(my_genomes)\n",
    "     |  Reference(ngenomes=2, ncontigs=100)\n",
    "     |  \n",
    "     |  Properties:\n",
    "     |  self.genomes: {genome_name: genome} dict\n",
    "     |  self.contigs: {contig_name: contig} dict\n",
    "     |  self.genomeof: {contig: genome} dict\n",
    "     |  self.breadth: Total length of all genomes\n",
    "     |  self.ngenomes\n",
    "     |  self.ncontigs\n",
    "\n",
    "        [ ... ]\n",
    "    \n",
    "Now, I can load in the taxonomy file:\n",
    "\n",
    "---"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 45,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Acidaminococcus_D21_uid55871\tAcidaminococcus_D21_uid55871\tAcidaminococcus\r\n",
      "Acidaminococcus_fermentans_DSM_20731_uid43471\tAcidaminococcus fermentans\tAcidaminococcus\r\n",
      "Acidaminococcus_intestini_RyC_MR95_uid74445\tAcidaminococcus intestini\tAcidaminococcus\r\n",
      "Actinomyces_ICM47_uid170984\tActinomyces_ICM47_uid170984\tActinomyces\r\n",
      "Adlercreutzia_equolifaciens_DSM_19450_uid223286\tAdlercreutzia equolifaciens\tAdlercreutzia\r\n",
      "Aeromicrobium_JC14_uid199535\tAeromicrobium_JC14_uid199535\tAeromicrobium\r\n",
      "Akkermansia_muciniphila_ATCC_BAA_835_uid58985\tAkkermansia muciniphila\tAkkermansia\r\n",
      "Alcanivorax_hongdengensis_A_11_3_uid176602\tAlcanivorax hongdengensis\tAlcanivorax\r\n",
      "Alistipes_AP11_uid199714\tAlistipes_AP11_uid199714\tAlistipes\r\n",
      "Alistipes_HGB5_uid67587\tAlistipes_HGB5_uid67587\tAlistipes\r\n"
     ]
    }
   ],
   "source": [
    "taxonomy_path = '/Users/jakobnissen/Downloads/vambdata/metahit/taxonomy.tsv'\n",
    "\n",
    "!head $taxonomy_path # show first 10 lines of reference file\n",
    "\n",
    "with open(taxonomy_path) as taxonomy_file:\n",
    "    reference.load_tax_file(taxonomy_file)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "---\n",
    "Now. we may load the binning itself. I'll automatically filter all bins smaller than 200,000 basepairs away:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "metadata": {},
   "outputs": [],
   "source": [
    "with open('/Users/jakni/Downloads/vamb/results/default_vamb/metahit/clusters.tsv') as clusters_file:\n",
    "    vamb_clusters = vamb.cluster.read_clusters(clusters_file)\n",
    "    vamb_bins = vamb.benchmark.Binning(vamb_clusters, reference, minsize=200000)\n",
    "    \n",
    "with open('/Users/jakni/Downloads/vamb/results/default_metabat/metahit/clusters.tsv') as clusters_file:\n",
    "    metabat_clusters = vamb.cluster.read_clusters(clusters_file)\n",
    "    metabat_bins = vamb.benchmark.Binning(metabat_clusters, reference, minsize=200000)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "---\n",
    "The Binning object keeps track of your bins and how they relate to the `Reference`:\n",
    "\n",
    "    help(vamb_bins)\n",
    "    \n",
    "    class Binning(builtins.object)\n",
    "     |  Binning(contigsof, reference, recalls=[0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.95, 0.99], precisions=[0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.95, 0.99], checkpresence=True, disjoint=True, binsplit_separator=None, minsize=None, mincontigs=None)\n",
    "     |  \n",
    "     |  The result of a set of clusters applied to a Reference.\n",
    "     |  >>> ref\n",
    "     |  (Reference(ngenomes=2, ncontigs=5)\n",
    "     |  >>> b = Binning({'bin1': {contig1, contig2}, 'bin2': {contig3, contig4}}, ref)\n",
    "     |  Binning(4/5 contigs, ReferenceID=0x7fe908180be0)\n",
    "     |  >>> b[0.5, 0.9] # num. genomes 0.5 recall, 0.9 precision\n",
    "     |  1\n",
    "     |  \n",
    "     |  Init arguments:\n",
    "     |  ----------- Required ---------\n",
    "     |  contigsof:     Dict of clusters, each sequence present in the Reference\n",
    "     |  reference:     Associated Reference object\n",
    "     |  ----------- Optional ---------\n",
    "     |  recalls:       Iterable of minimum recall thresholds\n",
    "     |  precisions:    Iterable of minimum precision thresholds\n",
    "     |  checkpresence: Whether to raise an error if a sequence if not present in Reference\n",
    "     |  disjoint:      Whether to raise an error if a sequence is in multiple bins\n",
    "     |  binsplit_separator: Split bins according to prefix before this separator in seq name\n",
    "     |  minsize:       Minimum sum of sequence lengths in a bin to not be ignored\n",
    "     |  mincontigs:    Minimum number of sequence in a bin to not be ignored\n",
    "     |  \n",
    "     |  Properties:\n",
    "     |  self.reference:       Reference object of this benchmark\n",
    "     |  self.recalls:         Sorted tuple of recall thresholds\n",
    "     |  self.precisions:      Sorted tuple of precision thresholds\n",
    "     |  self.nbins:           Number of bins\n",
    "     |  self.ncontigs:        Number of binned contigs\n",
    "     |  self.contigsof:       {bin_name: {contig set}}\n",
    "     |  self.binof:           {contig: bin_name(s)}, val is str or set\n",
    "     |  self.breadthof:       {bin_name: breadth}\n",
    "     |  self.intersectionsof: {genome: {bin:_name: intersection}}\n",
    "     |  self.breadth:         Total breadth of all bins\n",
    "     |  self.counters:        List of (rec, prec) Counters of genomes for each taxonomic rank\n",
    "     |  \n",
    "     |  Methods defined here:\n",
    "     \n",
    "     [ ... ]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "---\n",
    "The easiest way to get information about the quality of a `Binning` is by using the `summary` method. This prints the number of genomes reconstructed for each taxonomic level for each recall threshold at a precision threshold of 0.9:\n",
    "\n",
    "---"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Vamb bins:\n",
      "114\t112\t109\t106\t96\t67\t28\t8\t0\n",
      "129\t128\t125\t123\t112\t78\t35\t11\t0\n",
      "59\t59\t58\t58\t57\t50\t30\t11\t0\n",
      "\n",
      "METABAT2 bins:\n",
      "101\t93\t88\t82\t61\t29\t4\t0\t0\n",
      "116\t108\t102\t96\t73\t35\t5\t0\t0\n",
      "51\t48\t48\t48\t43\t32\t6\t2\t0\n"
     ]
    }
   ],
   "source": [
    "print('Vamb bins:')\n",
    "for rank in vamb_bins.summary():\n",
    "    print('\\t'.join(map(str, rank)))\n",
    "print('\\nMETABAT2 bins:')\n",
    "for rank in metabat_bins.summary():\n",
    "    print('\\t'.join(map(str, rank)))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "---\n",
    "On the [metabat webpage](https://bitbucket.org/berkeleylab/metabat/wiki/CAMI) they have a neat plot where they plot the number of observed bins at different recalls for a specific specificity. Just for fun, let's recreate that here with our data, using the species level.\n",
    "\n",
    "---"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 24,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAtkAAACjCAYAAABfawIQAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4xLjAsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy+17YcXAAAgAElEQVR4nO3de3xU9Z3/8fcnA4HEXICAU9ICioDRkkRkellbkQppcL0sUrpVsaIsam1rvXS7WuzFny1VF9ttXV2pCFUUW3TVrdVuwCjU+lOqRoWIBlQCKBg0CEzumOS7f5wzdRgmkAwTJgmv5+Mxj2S+l/P9nByGxyfffM/3mHNOAAAAAJInLdUBAAAAAH0NSTYAAACQZCTZAAAAQJKRZAMAAABJRpINAAAAJBlJNgAAAJBkJNkAAABAkpFkA0APYmZpZnaNmVWZWbOZvWtmvzSzozrZP2hmC/1+e81sq5n9xswGxWl7o5m5Dl7/mvyzA4AjR79UBwAA2Md/SPqepMck/VLSCf77CWY21TnX3lFHMzta0t8k5Uv6raTXJY2XdIWkSWb2JedcY5yu10iqjSmrONQTAYAjGUk2APQQZvZZSVdKetQ597Wo8mpJt0s6T9KDBzjEPEmjJF3gnPt9VP/n/X7XSvp5nH7/45zbfMgnAAD4O5aLAEDPcb4kk/TrmPJFkholXXiQ/l+R1CTpDzHlyyU1S7qko45mlmNmTLwAQJKQZANAz/E5Se2SXowudM41S3rNrz+QAZKanXMupn+7vOR7tJkNjdNvnaQ9kprN7HkzOyPB+AEAPpJsAOg58iXVOuda4tRtkzTUzNIP0H+9pMFmdlJ0of9+sP92ZFTVbkl3y1ui8k+SfihvucmTZnZxQmcAAJAkWcyEBwAgRczsHUn9nXMj49QtlfRNSYOdc7s76H+qpNWS3pF0tbwbHz8rb/nJsZL6SzrVOffcAWLI8/sNlDTCOVd/KOcEAEcqZrIBoOdolLfkI56BUW3ics79Vd7NkdmSnpS0RdKfJK2S9ITfLHygAJxzOyUtlDRI0imdDRwAsC9ucgGAnmO7pBPNbECcJSOflreUZO+BDuCce9jMHpVUKC/Z3uCc+8DMXpTUKuntTsSx2f8ab/02AKATmMkGgJ7jJXn/L38+utDMBko6SdLLnTmIc67NOfeac+6vfoL9KUkTJP2lg32yY431v+7ofOgAgGgk2QDQcyyX5OStp452qaRMScsiBWZ2nJkVHOyAZpYmb4/tgKT5UeX9zCw3TvsR8h5es1PS8wmcAwBALBcBgB7DOVdpZndK+q6/5OPP+uSJj3/Rvg+ieVreTiAWKTCzLHnb/z0mqVpSrry9tydKusE5tyqqf5akajP7H0lvStol6XhJc/26851zTd1xngBwJCDJBoCe5Wp5a6Ivk3SmvMed/6eknxzokeq+vfL2vL5A0nB5N0m+JGmac25FTNsmSY9I+oKk6fIS61pJ5ZL+3Tn3ogAACWMLPwAAACDJmMkGAADAIamoqEjv16/fIklflncPSF/XJum51tbWSydOnBh31yeSbAAAAByStLS0K3Jycr40atSo3WlpaX1+mUR7e7tt2bLly7t3775C0m/itWF3EQAAABySQCBwSX5+fsORkGBLUlpamsvPz68PBAIXd9jm8IUDAACAvsg5l5uenv5xquM4nNLT0z92zu23FWoESTYAAAAOlZnZwVv1If75dphLsyY7xQYNGuTGjBmT6jCQgIaGBh111FGpDgNdxHXrvbh2vRfXru+oqKiodc4N645jBwKBiWPHjm1qa2uzESNGtDz00EPVQ4cObUvW8W+//fa8l19++ailS5duvfbaa/OzsrLabrrppm57si1JdooFg0G9/HKnnpSMHmb16tWaPHlyqsNAF3Hdei+uXe/Ftes7zGxLdx17wIAB7VVVVW9I0owZM45ZsGDBsFtvvbWmu8brbiwXAQAAQI/yxS9+sWHbtm3pkfc//vGPg+PHjz9h3LhxJ15zzTX5kfI77rgjb9y4cScef/zxJ06fPv1YSXrwwQdzi4qKCk444YQTTznllHHvvvtuSiaVmckGAABAj9Ha2qpVq1Zl/8u//EutJD366KM5b7/99sB169a96ZzT1KlTx/zv//5v1rBhw1pvu+224S+88ELV8OHDW3fs2BGQpJKSkvrzzjuvKi0tTb/61a+G3nTTTZ9atGjRe4f7PEiyAQAAkHItLS1pBQUFJ27bti19/PjxjdOnTw9LUllZWc6zzz6bc+KJJ54oSY2NjWlVVVUDX3nllbSzzz571/Dhw1slKRgMtklSdXV1+vTp0z/z4Ycf9t+7d2/aiBEjWlJxPiTZKbZx40Z1x924U6dOTfoxsb/58+enOgRJ0sSJE1MdQq9SVlaW6hB6jZ72b+vhhx9OdQjdrrt+5qNHj+6W4wLJElmTvXPnzsBXv/rVMbfccsvRP/rRjz5wzunqq69+/wc/+EFtdPuf//znR5vZfvtyf/e73x151VVX1cyaNWvPE088kX3TTTflx7Y5HFiTDQAAgB4jLy+v7fbbb9965513BltaWuyMM84I33///UP37NmTJknV1dX9t23b1m/atGnhxx9/fEhNTU1AkiLLRerq6gIjR478WJLuvffevFSdBzPZAAAA6FG+9KUvNZ1wwglN99xzz+DvfOc7H61fv37g5z73uQJJyszMbF+2bFl1KBRq/v73v//+qaeeWpCWlubGjx/f+Mgjj2y+4YYbtp9//vnHBYPBvaFQqGHr1q0DUnEO5twR8fTLHivenzmSgeUiR5ae9id99B382zr8+uJyEbbw6zvMrMI5F4otX7t27ebi4uLaeH36srVr1w4tLi4+Jl4dy0UAAACAJCPJBgAAAJKMJBsAAABIMpJsAAAAIMlIsgEAAIAkI8kGAAAAkowkGwAAAEgykmwAAAAgyUiyAQAA0Kft2LEjUFJSclxGRsaE/Pz8woULFw6J1+7aa6/N79ev38mZmZkTIq833ngjPZExeaw6AAAAkm7Tpk3d+sjY0aNHV3S27dy5c0emp6e7mpqatWvWrMmcOXPmmFAo1BgKhZpj25555pm7/vjHP1Yfanwk2QAAAOizwuFwWllZ2eCKior1ubm57aWlpfVTpkzZs2TJkrxQKLStu8ZluQgAAAD6rMrKygGBQEBFRUUtkbKioqLGqqqqjHjtn3nmmdzc3NyTxowZ89lbb711WKLjdirJNrOLzcz5r3Fx6idH1U/tSgBmdrWZzehKn5j+N0aN7cys1cy2mNliM/v0AfqV++2/F1M+JuZ4Hb3Kzayfmf2bma0ysx1mVmdmFWZ2iZlZoucEAACA5KirqwtkZWW1RZfl5ua21dfXB2Lbzpo166PKysr1O3fufO2uu+7afNtttw3/7W9/G3f99sF0dSa7TtI345Rf5Ncl4mpJCSfZUb4s6R8kfUXSLySdKelJM9vvHM1shN9OkmbHVL/rHyfy+rJfvjim/EpJWZJ+KGmdpEslnSvpWUlL/BgAAACQQtnZ2W0NDQ375IPhcHi/xFuSJk6c2HzMMcd83K9fP5WUlDRceumlHzz66KODExm3q2uyH5V0oZn9xDnnJMnMMiR9TdIjki5OJIgk+ZtzrtX//q9m1iZpkaTjJb0Z0/ab8n7B+LOkfzSz8c651yXJOdciaU2koZlFfkbvOefWRB/ErxvtnNsVVVxuZnmSrjazG/3jAQAAIAUKCwtbWltbrbKyckBhYWGLJK1bty6joKCg6WB9zUx+yttlXZ3Jvl/SKH0yuyt5s7cBeUl2bGCnmdnT/jKKBjNbYWbjo+o3+8ebFbUM416/boyZ3W9m1WbWZGabzOwuM+vsbxNh/2v/OHUXSXpD3ix65H2XOedaYxLsiJckDZSU0J8XAAAAkBw5OTntpaWlu+fNm5cfDofTVq5ceVR5efmgOXPm7Ixt+8ADDwz68MMPA+3t7Vq1alXmokWLjj777LN3JzJuV5PsLfKWQ0QvGblI0mOS6qMbmtmZkp72yy+UdIGkbHmzzCP8ZudKqpG0Qp8sw/iZX5cv6T15iXCppJskTZE3+xxPwF8jnWFmEyXNk7Re0usxcX1R3uz2UufcW5JekDc7v9+6nENwmqSPJH2QxGMCAAAgAYsXL97S1NSUFgwGi2fPnj16wYIFW0OhUHNZWVlWZmbmhEi75cuXDx47dmxhVlbWhDlz5hz7ve99r+bKK6/cLxnvjES28Fsq6Zf+DYODJU2VdEacdr+R9Bfn3D9FCsxslaRNkr4v6Wrn3Ktm1iKpNnYphnPuWXkJfaTv85LelpekT3DOvRozXuw+h1WSznLOtceUz5bULukB//19khZKKpFUdsAz7wQz+0d5y2eud87tt9YHAADgSNCVfay7WzAYbCsvL38ntnzatGn1jY2Nf88p//SnPx3y/tgRiSTZD0u6Q9LZ8pZ61MibsZ4UaWBmYyUdJ+kXUWuaJalR3szxJB2EmaVL+ld5M+Wj5C2/iDheUmyS/UVJbfJm50dJuk7SSjM7xTm3wz/mAEnfkPSMcy6yL+Jyeb8QXKRDTLLNrFDSMklPSbrtAO0uk3TZoYwFAEBXrF69OmVj19fXp3R8IBW6nGQ75+rM7H/kLRk5RtIy51x7zI51R/tfF/uvWFs7MdTN8nbwuEnS8/J2L/mMvJsvB8ZpXxF14+OLZvaspPclXSsv4Zakc+TNvj9mZoOi+q6QNN3McpxzYSXAzMZIWinpLUkzDjSL7Zy7W9Ldfr/EVtMDANAFkydPTtnYq1evTun4QCok+sTHpZKelDdrfH6c+sjalR9KKo9Tv7cTY5wnb930zyMFZpbV2QCdczvMrFZSUVRxZLu+O/1XrH+WdE9nx4iKa6S82fyPJJ3hnKs/SBcAAAD0YYkm2U9JekjSbufc+jj1GyRtlvRZ59wtBzlWi6R4T9zJlPRxTNklnQ3QzIZLGirpQ/99UN4NlH+U9Os4XX4vb8lIl5Js/7jlfqwlzrmEFscDAACg70goyfaXQsSbwY7UOzP7jqQ/+murH5JUKyko6RRJW51zv/KbvyHpVDM7S9767lrn3GZ566Nnm1mlvBseZ/h9O/IFf2/syJrsH8hbo73Qr58l73z/wzn3l9jOZnafpH8zs9HOuU2d+DHIzDLlLTUZIe8XgJH+rHbEeudcog/pAQAAQC+V6Ez2QTnn/mxmkyTdIG92OENeEr1G3s2GET+U99CYh/w298l7qM2VkkzSfL/dn+Ul9i92MORzkaH9cSokfcs5F2k/W9I7itqxJMYSeWu3L5J0Y+fOUvmSiv3vfx+n/tSouAAAAHCE6FSS7Zy7V9K9B2mzWl5SHF32gqSzDtKvSl4yGlteK29ddqzYMW5UJ5Ji51zxQeo3xh7bL2+NV+7Xvd1RHQAAAI5cXX0YDQAAAICDIMkGAAAAkowkGwAAAH3ajh07AiUlJcdlZGRMyM/PL1y4cOGQjto+99xzmaFQ6PjMzMwJeXl5xT/72c+O7qjtgXTbjY8AAAA4cj388MMTu/P4X//61zv92Pa5c+eOTE9PdzU1NWvXrFmTOXPmzDGhUKgxFAo1R7d7//33+51zzjlj58+f/+7FF1+8q7m52aqrq9MTiY8kGwAAAH1WOBxOKysrG1xRUbE+Nze3vbS0tH7KlCl7lixZkhcKhbZFt50/f35w0qRJ4SuuuOIjScrIyHCDBw9ujn/kA2O5CAAAAPqsysrKAYFAQEVFRS2RsqKiosaqqqr9Hob48ssvHzV48ODWCRMmFAwZMqT49NNPH/PWW28lNJNNkg0AAIA+q66uLpCVldUWXZabm9tWX18fiG1bU1OT/t///d95v/71r7e+995760aOHNnyjW98Y3Qi47JcBAAAAH1WdnZ2W0NDwz4Ty+FweL/EW5IGDBjQXlpauvu0005rlKRbbrll+/Dhw0/auXNnIC8vb7/2B8JMNgAAAPqswsLCltbWVqusrBwQKVu3bl1GQUFBU2zbE044ocnsk+cMRr53znV5XJJsAAAA9Fk5OTntpaWlu+fNm5cfDofTVq5ceVR5efmgOXPm7IxtO2fOnNoVK1YMev755zNaWlps3rx5+SeffHL90KFDuzSLLZFkAwAAoI9bvHjxlqamprRgMFg8e/bs0QsWLNgaCoWay8rKsjIzMydE2p1zzjl1N9xww7bp06ePHTZsWHF1dfWA5cuXb0pkTNZkAwAAIOm6so91dwsGg23l5eXvxJZPmzatvrGx8dXosuuuu+7D66677sNDHZOZbAAAACDJSLIBAACAJCPJBgAAAJKMNdkpNm7cOG3YsCHVYSABq1ev1uTJk1MdBrqI69Z7ce0A9CbMZAMAAABJRpINAAAAJBlJNgAAAJBkJNkAAABAkpFkAwAAAElGkg0AAIA+bceOHYGSkpLjMjIyJuTn5xcuXLhwSLx2kyZNGpuZmTkh8urfv//J48aNOzGRMdnCDwAAAEl3/fXXT+zO499yyy2dfmz73LlzR6anp7uampq1a9asyZw5c+aYUCjUGAqFmqPbPfvss29Fv//85z9//KRJk8KJxEeSDQAAgD4rHA6nlZWVDa6oqFifm5vbXlpaWj9lypQ9S5YsyQuFQts66rdhw4b0ioqKrKVLl1YnMi7LRQAAANBnVVZWDggEAioqKmqJlBUVFTVWVVVlHKjfokWL8iZOnFhfUFCwN5FxSbIBAADQZ9XV1QWysrLaostyc3Pb6uvrAwfq99BDD+VdeOGFtYmOy3KRFNu4caPMLNVhoA+aOnVqqkPosebPn5/qENAJEyfuv5yzrKwsBZEgGVJ17eL9O+ptDnYOo0ePPkyR9E7Z2dltDQ0N+0wsh8Ph/RLvaCtWrMiqra3tP3v27F2JjstMNgAAAPqswsLCltbWVqusrBwQKVu3bl1GQUFBU0d9fve73+WVlpbuys3NbU90XJJsAAAA9Fk5OTntpaWlu+fNm5cfDofTVq5ceVR5efmgOXPm7IzXvr6+3p588snBl1xySdz6ziLJBgAAQJ+2ePHiLU1NTWnBYLB49uzZoxcsWLA1FAo1l5WVZWVmZk6Ibrts2bLB2dnZbWeddVbdoYxpzrlDixqHxMy4AOgWrMlGb9cX1tIi9frCv6OetCbbzCqcc6HY8rVr124uLi5O+CbB3mrt2rVDi4uLj4lXx0w2AAAAkGQk2QAAAECSkWQDAAAASUaSDQAAACQZSTYAAACQZCTZAAAAQJKRZAMAAABJRpINAAAAJBlJNgAAAJBkJNkAAADo03bs2BEoKSk5LiMjY0J+fn7hwoULh8Rr19TUZBdccMHIvLy84tzc3JNOP/30MdXV1f0TGbPfoYUMAAAA7K+kpKRbn2n/1FNPVXS27dy5c0emp6e7mpqatWvWrMmcOXPmmFAo1BgKhZqj282fP//oioqKrNdee219Xl5e2wUXXHDM5ZdfPnLlypXvdDW+XjOTbWZ/NLOPzGxAB/XZZtZgZvcepnj6mZkzsxsPx3gAAADounA4nFZWVjb45ptv3pabm9teWlpaP2XKlD1LlizJi21bXV094Ctf+Up4xIgRrZmZme688877aOPGjRmJjNtrkmxJ90kaLOmsDupnSsr02wEAAACqrKwcEAgEVFRU1BIpKyoqaqyqqtoveb788strX3zxxazNmzf3r6urS1u2bNmQ008/fU8i4/am5SJPSNop6SJJj8Spv0jSVkmrD2NMAAAA6MHq6uoCWVlZbdFlubm5bfX19YHYtuPHj2/+9Kc/3XLssccWBQIBjR07tumee+7ZkMi4vWYm2zm3V9IfJJ1hZkOj68xspKTTJN3vnHNmNs7MHjCzzWbWZGbvmNmdZjYopl+kzRfM7AW/bZWZTfPrf2BmW8xsj5k9FjvuJ4exH5vZNjNrNrO/mFlhN/0YAAAA0AXZ2dltDQ0N++S84XB4v8Rbki6++OJRzc3NaTU1Na/V1dW9ctZZZ+0qKSkZm8i4vSbJ9t0nqb+kb8SUXyjJJC31339a0hZJV0kqlTTf//pEnGMOlvQ7SXdLOlfSR5IeNbNfSfqSpG9LulbSVEm3x+k/R9JXJX1H0iWS8iU9E5vQAwAA4PArLCxsaW1ttcrKyr/f17du3bqMgoKCpti2b775Zubs2bN3BoPBtoyMDHfdddd9UFlZedT777/f5dUfvWm5iJxzL5nZG/KWhtwZVfVNSS845zb67VZJWhWpNLPnJW2StMrMCp1zlVF9cySd4Zx73m/7gaQKSdMkjXfOtfvlxZIuN7O0SJlvgKRS51yj3+5FSRvkJfj/L3lnDwAAgK7KyclpLy0t3T1v3rz8ZcuWbVmzZk1GeXn5oFWrVlXFti0uLm64//77884444y6rKys9ttuu23YsGHDPh4+fHhrV8ftVUm2b6mkW8xsnHNuo5l9XlKBpCsiDfwdSH4gb4Z7lKSBUf2PlxSdZIcjCbYv8gN/KiaZrpKULuloSTVR5U9EEmxJcs69Y2YvSfqHjk7AzC6TdNlBzxQAAOAgVq9eneoQerzFixdvmTVr1jHBYLB40KBBrQsWLNgaCoWay8rKsmbMmDG2sbHxVUm644473r3ssstGjh07dvzHH39s48aNa1q+fPnbiYzZG5PsByT9Qt5s9o/8ry2Slke1+Xd5SfeNktZIqpOXbD+sfRNuSdoV837vQcpj+++IE+MOScd1dALOubvlLU+RmbmO2gEAABzM5MmTUx1CXF3Zx7q7BYPBtvLy8v32up42bVp9JMGWpE996lNtjz/+eHUyxuxta7LlnNsmqVzShWaWLm999uPOueik+DxJS5xzv3DOPeOce0lSQtuvdEKwg7Jt3TQeAAAAerhel2T77pM3M32zpKH65IbHiAxJH8eUXdJNsZxlZpmRN2Z2nKTPSXqhm8YDAABAD9cbl4tI0mOSwpKukfSBpLKY+hWS5vg3Sb4j6euSPt9NsbRIWmFmt8lL7n8mb6nJb7ppPAAAAPRwvXIm2znXJG99tUl60DkXe8fntyU9KW+me7m8ddSzuimcJZJWSvovSfdK2i5pinNudzeNBwAAgB6ut85kyzk3V9LcDuo+lPTPcaospt2Fcfq2xrbzy++RdM8B2v2sU4EDAACgz+uVM9kAAABAT0aSDQAAACQZSTYAAACQZCTZAAAAQJKRZAMAAKBP27FjR6CkpOS4jIyMCfn5+YULFy4cEq9dbW1tYMaMGccMGTKkeMiQIcXXXnttfqJj9trdRQAAANBzmdnE7jy+c67Tj22fO3fuyPT0dFdTU7N2zZo1mTNnzhwTCoUaQ6FQc3S7b33rWyOamprStmzZUrl9+/Z+U6dOHTdq1KiWq666amdX42MmGwAAAH1WOBxOKysrG3zzzTdvy83NbS8tLa2fMmXKniVLluTFtn366adzr7/++prs7Oz2448/fu+sWbNqly5dOjSRcUmyAQAA0GdVVlYOCAQCKioqaomUFRUVNVZVVWXEa9/e3v73751zeuutt+K2OxiSbAAAAPRZdXV1gaysrLbostzc3Lb6+vpAbNtJkyaFb7755uG7du1Ke/311wc8+OCDQ5ubmxPKl0myAQAA0GdlZ2e3NTQ07JPzhsPh/RJvSbr77ru3Dhw4sH3s2LGF06dPH3Puued+FAwG9yYyLkk2AAAA+qzCwsKW1tZWq6ysHBApW7duXUZBQUFTbNtgMNj2+OOPV9fW1q59++2317e3t9tJJ53UkMi4JNkAAADos3JyctpLS0t3z5s3Lz8cDqetXLnyqPLy8kFz5szZb8eQ9evXD6ipqQm0trbqoYceylm2bNnQn/70p+8nMi5JNgAAAPq0xYsXb2lqakoLBoPFs2fPHr1gwYKtoVCouaysLCszM3NCpN0LL7yQWVRU9Nns7OwJP/nJTz5zzz33VMdu89dZ7JMNAACApOvKPtbdLRgMtpWXl78TWz5t2rT6xsbGVyPv586du2vu3Lm7kjEmM9kAAABAkjGTnWLjxo3Thg0bUh0GErB69WpNnjw51WGgi7huvRfXrvfi2uFIxEw2AAAAkGQk2QAAAECSkWQDAADgUDnnXKpjOKz8823vqJ4kGwAAAIfEzPbs3bu3f6rjOJz27t3b38z2dFTPjY8ptnHjxnoz487H3mmopNpUB4Eu47r1Xly73otr13eMilfY1tb2u+3bt185atSoPWlpaX1+Sru9vd22b9+e1dbWdntHbUiyU2+Dcy6U6iDQdWb2Mteu9+G69V5cu96La9f3tbe33xUOh0+urKz8sqRAquM5DNokPdfe3n5XRw1IsgEAAHBIJk6cuFfS7FTH0ZOwJhsAAABIMpLs1Ls71QEgYVy73onr1ntx7Xovrh2OOHakbbcCAAAAdDdmsgEAAIAkI8kGAAAAkowkO0XMLGBmC8zsQzOrM7NHzGxoquPCvszsVjNbb2ZhM9tuZovMbEhU/cVm1m5m9VGv36cyZkhmdq+ZfRxzXb4d0+YiM3vHzBrN7G9mNjFV8eIT/uct+ro1mZkzs5PNbLL/fXT986mO+UhlZueZ2V/9/x9b49RP869nk5m9bmZfjakfY2blZtZgZu+Z2fcPX/RA9yPJTp3rJf2TpC9I+oxfdn/qwkEH2iRdKClPUrG8a/W7mDabnHNZUa/zD3eQiOu+mOvyX5EKM/uypLskXSFpsKRHJP3ZzHJSFCt8zrnPRl83Sb+S9IZz7hW/SVvMdT0lheEe6XZJ+i9JV8dWmNloSY9KullSrv/1MTM7xq8PSPqTpDclDZN0jqTrzOwbhyNw4HAgyU6dyyTd6pzb5JzbI+nfJE2L/AeEnsE5N88596pz7mPn3IeS7pA0OcVh4dBdKulR59xK51yLpAWSWiSdm9qwEM3M+kmaI+m3qY4F+3POrXDO/V7SpjjVsyVVOOcecM7tdc4tk/SKPtlHeZK8Jwf+0DnX6P8S9VtJ3zocsQOHA0l2CphZrqSRkioiZc65dySFJRWlKi50yhRJ62LKRphZjZm9a2Z/MLNjUxEY9vM1M/vIzDb6S7OyouqKte/nz0l61S9HzzFd3izo0qiygP9ZqzGzJ82Ma9Yz7fMZ872iTz5jxZI2OufqO6gHej2S7NSI/El6T0z57qg69DBm9jV5M6BXRXfrSd0AAAK0SURBVBU/K6lQUr6kz0lqlvSUmR11+CNElP+UVCBpqLzZ6dMkLYqqzxafv97gcknLnXO7/fdVkk6SdKy867tO0jNmlp+i+NCxg33G+AyizyPJTo06/2tuTPkgebPZ6GHM7OvykrRzotaGyl/us9E51+6cq5GXhOdL+mKKQoUk51yFc26Hf13WS7pG0kwzG+A3qROfvx7NzI6T95ejhZEy51yNc26tc67VObfbOfdDSR9JOiNVcaJDB/uM8RlEn0eSnQL+rMxWSSdHyvybRHK0/1IEpJiZXSJvreDZzrlVB2nu/Jd1e2Doinb/a+S6rNW+nz+TN0O69jDHhY5dLmmtc+5vB2nXLj5vPdE+nzHfBH3yGVsraVzMX/2i64FejyQ7de6Wdyf1sf6OBrdKWuGc25zasBDNzL4n6TZJpc65/x+n/kwz+4x5hki6U1KtpDWHOVRE8bcWG+R/P1bSLyU97pxr9psskjTDzKaYWbqk70saKOmxlASMffjX5GJFzWL75af7276lmVmWmd0oKShpxeGPEv5WtAMlpfvvB/ovk7eOPmRm55tZfzM7X9JESff53Z+VtEXSL8wsw8xOkveLFTe5os8gyU6dW+RtX/SSpG2SAvK2ikPP8ht5f2FYFb03b1T9ZEkvSqqXtF7eVn8lMTfz4PD7lqRNZtYgaaW8X3ouiVQ6556T9G15yfYeSf8s6R+dc/ypumeYISlD0rKY8mJJT8tbarBJ3rKsEufcu4c3PPi+KalJ3i85Af/7Jkmj/Jv5Z0j6kbwlID+SdG5kIsk51ybpbEnjJe2U9GdJC5xzfzjM5wB0G/NuqgcAAACQLMxkAwAAAElGkg0AAAAkGUk2AAAAkGQk2QAAAECSkWQDAAAASUaSDQAAACQZSTYAAACQZCTZAAAAQJKRZAMAAABJ9n/R7NWL47IFnAAAAABJRU5ErkJggg==\n",
      "text/plain": [
       "<Figure size 720x144 with 1 Axes>"
      ]
     },
     "metadata": {
      "needs_background": "light"
     },
     "output_type": "display_data"
    },
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAtkAAAC2CAYAAAAMcJE4AAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4xLjAsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy+17YcXAAAgAElEQVR4nO3de3hV1bX38e9IMJAYCBAwJa2gCIgWEpFt67EWqUCD9XLQ0nrBilKqvXt7e2qxtVZL1WI91WNPqQr1hi1a9WirDTQKtVapGmuI2IDKTcGg4ZYrgSTj/WOtXbebhFzYYZPk93me/SR7zrnWGiuL6NgzY81l7o6IiIiIiCROSrIDEBERERHpbpRki4iIiIgkmJJsEREREZEEU5ItIiIiIpJgSrJFRERERBJMSbaIiIiISIIpyRYRERERSTAl2SIiXYCZpZjZlWZWZma7zOwdM/uFmR3axu1zzGx+uN1uM9toZrebWf/Ojl1EpCcyPYxGROTgZ2a3A98FHgf+DBwDfAf4GzDZ3Zv2se1hwEtALvAb4HVgDHAZsAr4jLvXduoJiIj0ML2SHYCIiOybmX2SIKF+zN2/GNO+DrgDOA94aB+7mAMMAy5w99/FbP9CuN1VwE87IXQRkR5LM9kiIgc5M/spcC0wwd3/FtPeB9gK/NXdv7CP7UuAkcChHvMffTNLAWqAze5+VGfFLyLSE6kmW0Tk4HcC0ERQ8vFv7r4LeC3s35fewC6Pm1UJS0zqgOFmNihx4YqIiJJsEZGDXy5Q4e71zfRtAgaZWdo+tl8FDDCz42Ibw/cDwrdDExKpiIgASrJFRLqCDKC5BBtgV8yYlvySYCb8YTP7gpkNNbPTgMXAnjZsLyIi7aQkW0Tk4FdLUPLRnD4xY5oV1nGfB/QFngI2AH8ElgF/CodVJiRSEREBtLqIiEhXsBk41sx6N1My8nGCUpLd+9qBuz9iZo8BYwmS7dXu/r6ZvQQ0AG91RuAiIj2VZrJFRA5+LxP89/pTsY3h6iLHAa+0ZSfu3ujur7n738IE+2PAOILVSbROtohIAinJFhE5+C0GHLgirv1rBLXUi6INZnaUmY1ubYfh8n13AKnA3MSFKiIioHWyRUS6BDP7H+DbBE98fJrgiY/fBf4OnBp94qOZrQeGubvFbJtJsPzf48A6IAs4HxgPXOvuPztwZyIi0jOoJltEpGu4AlgPXAqcDlQA/wNct69Hqod2AyuBC4AhBDdJvgxMdfclnRWwiEhPpplsEREREZEE00y2iIiIiOyX4uLitF69et0NnExwr0d31wg839DQ8LXx48c3u7qTkmwRERER2S8pKSnf6Nev32eGDRu2IyUlpduXSTQ1NdmGDRtO3rFjxzeA25sbo9VFRERERGS/pKamXpKbm1vTExJsgJSUFM/Nza1OTU29uMUxBy4cEREREemO3D0rLS1tT7LjOJDS0tL2uHtWS/1KskVERERkf5mZtT6qGwnPt8VcWjXZSda/f38fMWJEssOQVtTU1HDooYcmOwxpA12rrkHXqWvQdZJ4xcXFFe4+uDP2nZqaOn7kyJF1jY2Ndvjhh9c//PDD6wYNGtSYqP3fcccd2a+88sqh999//8arrroqNzMzs/GGG27Ykqj9x1OSnWQ5OTm88kqbnogsSbR8+XImTpyY7DCkDXStugZdp65B10nimdmGztp37969m8rKyt4AOOecc46YN2/e4FtuuaW8s47X2VQuIiIiIiIHlRNPPLFm06ZNadH3P/rRj3LGjBlzzKhRo4698sorc6Ptd955Z/aoUaOOPfroo4+dNm3akQAPPfRQVl5e3uhjjjnm2JNOOmnUO++8k5RJZc1ki4iIiMhBo6GhgWXLlvX96le/WgHw2GOP9Xvrrbf6rFy58l/uzuTJk0f8+c9/zhw8eHDDrbfeOuTFF18sGzJkSMOWLVtSAaZMmVJ93nnnlaWkpHDbbbcNuuGGGz529913v3ugz0NJtoiIiIgkXX19fcro0aOP3bRpU9qYMWNqp02bVglQWFjY77nnnut37LHHHgtQW1ubUlZW1ufVV19NOfPMM7cPGTKkASAnJ6cRYN26dWnTpk37xAcffHDI7t27Uw4//PD6ZJyPkuwkW7NmDYm+G3fy5MkJ3Z8E5s6dm+wQ2mT8+PHJDiHpCgsLkx1Cj9bWf4OPPPJIJ0fSsyXivwVDhw5l7dq1+xwzfPjw/T6OCHxYk71169bUz3/+8yNuvvnmw374wx++7+5cccUV733ve9+riB3/05/+9DAz22td7m9/+9tDL7/88vIZM2bs/NOf/tT3hhtuyI0fcyCoJltEREREDhrZ2dmNd9xxx8Zf/epXOfX19XbaaadVPvDAA4N27tyZArBu3bpDNm3a1Gvq1KmVTz755MDy8vJUgGi5SFVVVerQoUP3ANx7773ZyToPzWSLiIiIyEHlM5/5TN0xxxxTd8899wz41re+tW3VqlV9TjjhhNEAGRkZTYsWLVoXiUR2XX311e999rOfHZ2SkuJjxoypffTRR9dfe+21m88///yjcnJydkcikZqNGzf2TsY5mHuPePrlQau5P3PsL5WL9GwqF5Fk07/Bg8OBug4qF+lZzKzY3SPx7SUlJevz8/MrmtumOyspKRmUn59/RHN9KhcREREREUkwJdkiIiIiIgmmJFtEREREJMGUZIuIiIiIJJiSbBERERGRBFOSLSIiIiKSYEqyRUREREQSTEm2iIiIiEiCKckWERERkW5ty5YtqVOmTDkqPT19XG5u7tj58+cPbG7cVVddldurV6/jMzIyxkVfb7zxRlpHjqnHqouIiIhIwq1du7ZTHzs6fPjw4raOnT179tC0tDQvLy8vWbFiRcb06dNHRCKR2kgksit+7Omnn779iSeeWLe/8SnJFhEREZFuq7KyMqWwsHBAcXHxqqysrKaCgoLqSZMm7Vy4cGF2JBLZ1FnHVbmIiIiIiHRbpaWlvVNTU8nLy6uPtuXl5dWWlZWlNzf+2WefzcrKyjpuxIgRn7zlllsGd/S4bUqyzexiM/PwNaqZ/okx/ZPbE4CZXWFm57Rnm7jtr485tptZg5ltMLMFZvbxfWxXFI7/blz7iLj9tfQqMrNeZvZfZrbMzLaYWZWZFZvZJWZmHT0nEREREUmMqqqq1MzMzMbYtqysrMbq6urU+LEzZszYVlpaumrr1q2v/frXv15/6623DvnNb37TbP12a9o7k10FfKWZ9ovCvo64Auhwkh3jZOA/gM8BPwNOB54ys73O0cwOD8cBzIzrfifcT/R1cti+IK79O0Am8ANgJfA14GzgOWBhGIOIiIiIJFHfvn0ba2pqPpIPVlZW7pV4A4wfP37XEUccsadXr15MmTKl5mtf+9r7jz322ICOHLe9NdmPARea2XXu7gBmlg58EXgUuLgjQSTIP9y9Ifz+b2bWCNwNHA38K27sVwg+YDwNfMHMxrj76wDuXg+siA40s+jP6F13XxG7k7BvuLtvj2kuMrNs4Aozuz7cn4iIiIgkwdixY+sbGhqstLS099ixY+sBVq5cmT569Oi61rY1M8KUt93aO5P9ADCMD2d3IZi9TSVIsuMDO8XMngnLKGrMbImZjYnpXx/ub0ZMGca9Yd8IM3vAzNaZWZ2ZrTWzX5tZWz9NVIZfD2mm7yLgDYJZ9Oj7dnP3hrgEO+ploA/QoT8viIiIiEhi9OvXr6mgoGDHnDlzcisrK1OWLl16aFFRUf9Zs2ZtjR/74IMP9v/ggw9Sm5qaWLZsWcbdd9992JlnnrmjI8dtb5K9gaAcIrZk5CLgcaA6dqCZnQ48E7ZfCFwA9CWYZT48HHY2UA4s4cMyjBvDvlzgXYJEuAC4AZhEMPvcnNSwRjrdzMYDc4BVwOtxcZ1IMLt9v7u/CbxIMDu/V13OfjgF2Aa8n8B9ioiIiEgHLFiwYENdXV1KTk5O/syZM4fPmzdvYyQS2VVYWJiZkZExLjpu8eLFA0aOHDk2MzNz3KxZs4787ne/W/6d73xnr2S8LTqyhN/9wC/CGwYHAJOB05oZdzvwV3f/z2iDmS0D1gJXA1e4+z/NrB6oiC/FcPfnCBL66LYvAG8RJOnj3P2fcceLX+ewDDjD3Zvi2mcCTcCD4fv7gPnAFKBwn2feBmb2BYLymWvcfa9aHxEREZGeoD3rWHe2nJycxqKiorfj26dOnVpdW1v775zyj3/8436vjx3VkST7EeBO4EyCUo9yghnrCdEBZjYSOAr4WUxNM0AtwczxBFphZmnA/yOYKR9GUH4RdTQQn2SfCDQSzM4PA74PLDWzk9x9S7jP3sC5wLPuHl0XcTHBB4KL2M8k28zGAouAvwC37mPcpcCl+3MsERGRg8Hy5cuTHYLIQandSba7V5nZ/xGUjBwBLHL3prgV6w4Lvy4IX/E2tuFQNxGs4HED8ALB6iWfILj5sk8z44tjbnx8ycyeA94DriJIuAHOIph9f9zM+sdsuwSYZmb93L2SDjCzEcBS4E3gnH3NYrv7XcBd4XYdq6YXERE5CEycODHZIYgclDr6xMf7gacIZo3Pb6Y/WrvyA6Comf7dbTjGeQR10z+NNphZZlsDdPctZlYB5MU0R5fr+1X4ivdl4J62HiMmrqEEs/nbgNPcvbqVTURERESkG+tokv0X4GFgh7uvaqZ/NbAe+KS739zKvuqB5p64kwHsiWu7pK0BmtkQYBDwQfg+h+AGyieAXzazye8ISkbalWSH+y0KY53i7h0qjhcRERGR7qNDSXZYCtHcDHa0383sW8ATYW31w0AFkAOcBGx099vC4W8AnzWzMwjquyvcfT1BffRMMysluOHxnHDblnw6XBs7WpP9PYIa7flh/wyC8/1vd/9r/MZmdh/wX2Y23N3XtuHHgJllEJSaHE7wAWBoOKsdtcrdO/qQHhERERHpojo6k90qd3/azCYA1xLMDqcTJNErCG42jPoBwUNjHg7H3EfwUJvvAAbMDcc9TZDYv9TCIZ+PHjo8TjHwdXePjp8JvE3MiiVxFhLUbl8EXN+2syQXyA+//10z/Z+NiUtEREREeog2Jdnufi9wbytjlhMkxbFtLwJntLJdGUEyGt9eQVCXHS/+GNfThqTY3fNb6V8Tv++wvaG59rDvrZb6RERERKTnau/DaEREREREpBVKskVEREREEkxJtoiIiIh0a1u2bEmdMmXKUenp6eNyc3PHzp8/f2BLY59//vmMSCRydEZGxrjs7Oz8G2+88bCWxu5Lp934KCIiIiI91yOPPDK+M/f/pS99qc2PbZ89e/bQtLQ0Ly8vL1mxYkXG9OnTR0QikdpIJLIrdtx7773X66yzzho5d+7cdy6++OLtu3btsnXr1qV1JD4l2SIiIiLSbVVWVqYUFhYOKC4uXpWVldVUUFBQPWnSpJ0LFy7MjkQim2LHzp07N2fChAmV3/jGN7YBpKen+4ABA3Y1v+d9U7mIiIiIiHRbpaWlvVNTU8nLy6uPtuXl5dWWlZXt9TDEV1555dABAwY0jBs3bvTAgQPzTz311BFvvvlmh2aylWSLiIiISLdVVVWVmpmZ2RjblpWV1VhdXZ0aP7a8vDztD3/4Q/Yvf/nLje++++7KoUOH1p977rnDO3JclYuIiIiISLfVt2/fxpqamo9MLFdWVu6VeAP07t27qaCgYMcpp5xSC3DzzTdvHjJkyHFbt25Nzc7O3mv8vmgmW0RERES6rbFjx9Y3NDRYaWlp72jbypUr00ePHl0XP/aYY46pM/vwOYPR79293cdVki0iIiIi3Va/fv2aCgoKdsyZMye3srIyZenSpYcWFRX1nzVr1tb4sbNmzapYsmRJ/xdeeCG9vr7e5syZk3v88cdXDxo0qF2z2KAkW0RERES6uQULFmyoq6tLycnJyZ85c+bwefPmbYxEIrsKCwszMzIyxkXHnXXWWVXXXnvtpmnTpo0cPHhw/rp163ovXrx4bUeOaR2Z/pbEMbOEX4DJkycnepfShYwf36nLkoq0Sv8GDw4H6joMH96he8KkizKzYnePxLeXlJSsz8/Pr0hGTMlUUlIyKD8//4jm+jSTLSIiIiKSYEqyRUREREQSTEm2iIiIiEiCaZ3sJBs1ahSrV69OdhjSiuXLlzNx4sRkhyFtoGvVNeg6dQ26TiIdp5lsEREREZEEU5ItIiIiIpJgSrJFRERERBJMSbaIiIiISIIpyRYRERERSTAl2SIiIiLSrW3ZsiV1ypQpR6Wnp4/Lzc0dO3/+/IHNjZswYcLIjIyMcdHXIYcccvyoUaOO7cgxtYSfiIiIiCTcNddcM74z93/zzTcXt3Xs7Nmzh6alpXl5eXnJihUrMqZPnz4iEonURiKRXbHjnnvuuTdj33/qU586esKECZUdiU9JtoiIiIh0W5WVlSmFhYUDiouLV2VlZTUVFBRUT5o0aefChQuzI5HIppa2W716dVpxcXHm/fffv64jx1W5iIiIiIh0W6Wlpb1TU1PJy8urj7bl5eXVlpWVpe9ru7vvvjt7/Pjx1aNHj97dkeMqyRYRERGRbquqqio1MzOzMbYtKyursbq6OnVf2z388MPZF154YUVHj6tykSRbs2YNZpbsMKQbmTx5crJDSLq5c+cmOwRpg/Zep/HjO7W8U1pQWFiY7BASoiv/+2kt9uHDhx+gSLqmvn37NtbU1HxkYrmysnKvxDvWkiVLMisqKg6ZOXPm9o4eVzPZIiIiItJtjR07tr6hocFKS0t7R9tWrlyZPnr06LqWtvntb3+bXVBQsD0rK6upo8dVki0iIiIi3Va/fv2aCgoKdsyZMye3srIyZenSpYcWFRX1nzVr1tbmxldXV9tTTz014JJLLmm2v62UZIuIiIhIt7ZgwYINdXV1KTk5OfkzZ84cPm/evI2RSGRXYWFhZkZGxrjYsYsWLRrQt2/fxjPOOKNqf45p7r5/Uct+MTNdAEko1WRLd9WVa2ol+bryv5+DqSbbzIrdPRLfXlJSsj4/P7/DNwl2VSUlJYPy8/OPaK5PM9kiIiIiIgmmJFtEREREJMGUZIuIiIiIJJiSbBERERGRBFOSLSIiIiKSYEqyRUREREQSTEm2iIiIiEiCKckWEREREUkwJdkiIiIiIgmmJFtEREREurUtW7akTpky5aj09PRxubm5Y+fPnz+wuXF1dXV2wQUXDM3Ozs7Pyso67tRTTx2xbt26QzpyzF77F7KIiIiIyN6mTJnSqc+y/8tf/lLc1rGzZ88empaW5uXl5SUrVqzImD59+ohIJFIbiUR2xY6bO3fuYcXFxZmvvfbaquzs7MYLLrjgiMsuu2zo0qVL325vfF1mJtvMnjCzbWbWu4X+vmZWY2b3HqB4epmZm9n1B+J4IiIiItJ+lZWVKYWFhQNuuummTVlZWU0FBQXVkyZN2rlw4cLs+LHr1q3r/bnPfa7y8MMPb8jIyPDzzjtv25o1a9I7ctwuk2QD9wEDgDNa6J8OZITjREREREQoLS3tnZqaSl5eXn20LS8vr7asrGyv5Pmyyy6reOmllzLXr19/SFVVVcqiRYsGnnrqqTs7ctyuVC7yJ2ArcBHwaDP9FwEbgeUHMCYREREROYhVVVWlZmZmNsa2ZWVlNVZXV6fGjx0zZsyuj3/84/VHHnlkXmpqKiNHjqy75557VnfkuF1mJtvddwO/B04zs0GxfWY2FDgFeMDd3cxGmdmDZrbezOrM7G0z+5WZ9Y/bLjrm02b2Yji2zMymhv3fM7MNZrbTzB6PP+6Hu7EfmdkmM9tlZn81s7Gd9GMQERERkXbo27dvY01NzUdy3srKyr0Sb4CLL7542K5du1LKy8tfq6qqevWMM87YPmXKlJEdOW6XSbJD9wGHAOfGtV8IGHB/+P7jwAbgcqAAmBt+/VMz+xwA/Ba4Czgb2AY8Zma3AZ8BvglcBUwG7mhm+1nA54FvAZcAucCz8Qm9iIiIiBx4Y8eOrW9oaLDS0tJ/39e3cuXK9NGjR9fFj/3Xv/6VMXPmzK05OTmN6enp/v3vf//90tLSQ9977712V390pXIR3P1lM3uDoDTkVzFdXwFedPc14bhlwLJop5m9AKwFlpnZWHcvjdm2H3Cau78Qjn0fKAamAmPcvSlszwcuM7OUaFuoN1Dg7rXhuJeA1QQJ/k8Sd/YiIiIi0l79+vVrKigo2DFnzpzcRYsWbVixYkV6UVFR/2XLlpXFj83Pz6954IEHsk877bSqzMzMpltvvXXw4MGD9wwZMqShvcftUkl26H7gZjMb5e5rzOxTwGjgG9EB4Qok3yOY4R4G9InZ/mggNsmujCbYoegP/C9xyXQZkAYcBpTHtP8pmmADuPvbZvYy8B8tnYCZXQpc2uqZioiIiLRi+fLlyQ7hoLdgwYINM2bMOCInJye/f//+DfPmzdsYiUR2FRYWZp5zzjkja2tr/wlw5513vnPppZcOHTly5Jg9e/bYqFGj6hYvXvxWR47ZFZPsB4GfEcxm/zD8Wg8sjhnzc4Kk+3pgBVBFkGw/wkcTboDtce93t9Iev/2WZmLcAhzV0gm4+10E5SmYmbc0TkRERKQ1EydOTHYIzWrPOtadLScnp7GoqGivta6nTp1aHU2wAT72sY81Pvnkk+sSccyuVpONu28CioALzSyNoD77SXePTYrPAxa6+8/c/Vl3fxno0PIrbZDTQtumTjqeiIiIiBzkulySHbqPYGb6JmAQH97wGJUO7Ilru6STYjnDzDKib8zsKOAE4MVOOp6IiIiIHOS6YrkIwONAJXAl8D5QGNe/BJgV3iT5NvAl4FOdFEs9sMTMbiVI7m8kKDW5vZOOJyIiIiIHuS45k+3udQT11QY85O7xd3x+E3iKYKZ7MUEd9YxOCmchsBT4X+BeYDMwyd13dNLxREREROQg11VnsnH32cDsFvo+AL7cTJfFjbuwmW0b4seF7fcA9+xj3I1tClxEREREur0uOZMtIiIiInIwU5ItIiIiIpJgSrJFRERERBJMSbaIiIiISIIpyRYRERGRbm3Lli2pU6ZMOSo9PX1cbm7u2Pnz5w9sblxFRUXqOeecc8TAgQPzBw4cmH/VVVfldvSYXXZ1ERERERE5eJnZ+M7cv7u3+bHts2fPHpqWlubl5eUlK1asyJg+ffqISCRSG4lEdsWO+/rXv354XV1dyoYNG0o3b97ca/LkyaOGDRtWf/nll29tb3yayRYRERGRbquysjKlsLBwwE033bQpKyurqaCgoHrSpEk7Fy5cmB0/9plnnsm65ppryvv27dt09NFH754xY0bF/fffP6gjx1WSLSIiIiLdVmlpae/U1FTy8vLqo215eXm1ZWVl6c2Nb2pq+vf37s6bb77Z7LjWKMkWERERkW6rqqoqNTMzszG2LSsrq7G6ujo1fuyECRMqb7rppiHbt29Pef3113s/9NBDg3bt2tWhfFlJtoiIiIh0W3379m2sqan5SM5bWVm5V+INcNddd23s06dP08iRI8dOmzZtxNlnn70tJydnd0eOqyRbRERERLqtsWPH1jc0NFhpaWnvaNvKlSvTR48eXRc/Nicnp/HJJ59cV1FRUfLWW2+tampqsuOOO66mI8dVki0iIiIi3Va/fv2aCgoKdsyZMye3srIyZenSpYcWFRX1nzVr1l4rhqxatap3eXl5akNDAw8//HC/RYsWDfrxj3/8XkeOqyRbRERERLq1BQsWbKirq0vJycnJnzlz5vB58+ZtjEQiuwoLCzMzMjLGRce9+OKLGXl5eZ/s27fvuOuuu+4T99xzz7r4Zf7aSutki4iIiEjCtWcd686Wk5PTWFRU9HZ8+9SpU6tra2v/GX0/e/bs7bNnz96eiGNqJltEREREJME0k51ko0aNYvXq1ckOQ1qxfPlyJk6cmOwwpA10rboGXaeuQddJpOM0ky0iIiIikmBKskVEREREEkxJtoiIiIjsL3f3ZMdwQIXn29RSv5JsEREREdkvZrZz9+7dhyQ7jgNp9+7dh5jZzpb6deNjkq1Zs6bazHTn48FvEFCR7CCkTXStugZdp65B10niDWuusbGx8bebN2/+zrBhw3ampKR0+yntpqYm27x5c2ZjY+MdLY1Rkp18q909kuwgZN/M7BVdp65B16pr0HXqGnSdpK2ampp+XVlZeXxpaenJQGqy4zkAGoHnm5qaft3SACXZIiIiIrJfxo8fvxuYmew4DiaqyRYRERERSTAl2cl3V7IDkDbRdeo6dK26Bl2nrkHXSaSDrKcttyIiIiIi0tk0ky0iIiIikmBKskVEREREEkxJdpKYWaqZzTOzD8ysysweNbNByY6rJzOzW8xslZlVmtlmM7vbzAbG9F9sZk1mVh3z+l0yY+6JzOxeM9sTdx2+GTfmIjN728xqzewfZjY+WfH2ZOHvU+x1qjMzN7PjzWxi+H1s/wvJjrknMLPzzOxv4X/rGprpnxpeuzoze93MPh/XP8LMisysxszeNbOrD1z0Il2HkuzkuQb4T+DTwCfCtgeSF44QrHl5IZAN5BNcl9/GjVnr7pkxr/MPdJACwH1x1+F/ox1mdjLwa+AbwADgUeBpM+uXpFh7LHf/ZOx1Am4D3nD3V8MhjXHX8aQkhtuTbAf+F7givsPMhgOPATcBWeHXx83siLA/Ffgj8C9gMHAW8H0zO/dABC7SlSjJTp5LgVvcfa277wT+C5ga/Q+ZHHjuPsfd/+nue9z9A+BOYGKSw5L2+xrwmLsvdfd6YB5QD5yd3LB6NjPrBcwCfpPsWHo6d1/i7r8D1jbTPRModvcH3X23uy8CXuXD9Y8nEDzx7wfuXht+YPoN8PUDEbtIV6IkOwnMLAsYChRH29z9baASyEtWXLKXScDKuLbDzazczN4xs9+b2ZHJCEz4opltM7M1YdlVZkxfPh/93XLgn2G7JM80gpnR+2PaUsPfpXIze8rMdI2S7yO/P6FX+fD3Jx9Y4+7VLfSLSEhJdnJE/2y9M659R0yfJJGZfZFgRvTymObngLFALnACsAv4i5kdeuAj7NH+BxgNDCKYnT4FuDumvy/63ToYXQYsdvcd4fsy4DjgSILruRJ41sxykxSfBFr7/dHvl0gbKclOjqrwa1Zce3+C2WxJIjP7EkHSdlZM7Shhac8ad29y97KLxaEAAAqsSURBVHKCJDwXODFJofZI7l7s7lvC67AKuBKYbma9wyFV6HfroGJmRxH8ZWh+tM3dy929xN0b3H2Hu/8A2Aaclqw4BWj990e/XyJtpCQ7CcKZnI3A8dG28GaTfuxdniAHkJldQlBfeKa7L2tluIcv6/TAZF+awq/R61DCR3+3jGDGtOQAxyUfugwocfd/tDKuCf0+JdtHfn9C4/jw96cEGBX3F7zYfhEJKclOnrsI7sg+Mlz14BZgibuvT25YPZeZfRe4FShw978303+6mX3CAgOBXwEVwIoDHGqPFi4/1j/8fiTwC+BJd98VDrkbOMfMJplZGnA10Ad4PCkB93DhNbiYmFnssP3UcCm4FDPLNLPrgRxgyYGPsmcJl5DtA6SF7/uELyOomY+Y2flmdoiZnQ+MB+4LN38O2AD8zMzSzew4gg9RuqFVJI6S7OS5mWAZpJeBTUAqwfJxkjy3E/w1YVns2r0x/ROBl4BqYBXBUn9T4m4Aks73dWCtmdUASwk+5FwS7XT354FvEiTbO4EvA19wd/05OznOAdKBRXHt+cAzBOUHawnKrqa4+zsHNrwe6StAHcEHmtTw+zpgWHgT/jnADwlKQH4InB2dAHL3RuBMYAywFXgamOfuvz/A5yBy0LPgxnsREREREUkUzWSLiIiIiCSYkmwRERERkQRTki0iIiIikmBKskVEREREEkxJtoiIiIhIginJFhERERFJMCXZItLpzOwiM9sQ8/5fZvaNBB/jP8zsH2ZWY2YePiSjtfG/N7N3zWy3mVWa2ctmdqOZDUlkbN2NmU0Mf8aT2zB2vZnd24mxHGdm14cPiIrv8/AhN7Fts8zszfCa7+iMGM1suZktT9T+RKRr6pXsAESkRxgPFAOYWSYwKvo+gRYQPFDjTKAWWNPSQDO7GpgHLCN42MZaIBM4CbgUiACnJTg+6RzHAT8GHgS2xfX9B/Bu9I2Z5RI8bXcRwQOMok8JPZvgwSsiIgmjJFtEDoTxwJ9jvm8CViZq52aWAhwNzHX3Z1sZ+zmCBPt2d78yrvtpM7sJ+FKiYpPkcfcVcU0jCZ5weF/4ZNDouH8e0MBEpEdQuYiIdKowAT4OeDVsGg+84e67Wt7qI9v3M7M7zWyzmdWb2Wozu9LMLOy/GGgk+O/Zj8ISgfX72OX3gYrw617cvcbd742LIcPMbjGzdWGZwTozuzY8t+iYaAnFWWG8FWb2gZk9aGb923NOcfubZma/MbNtZrbdzP7bzFLN7AQzez4sj1llZgXN/OxOMbNnzKwqHLfEzMbEjSkwsxfMbKeZVYexXLePn1+LzOzysPRil5m9YmafbWHckWa2KPz51JvZa2Z2dtyY68PzH2lmT4WxbTCz66I/9/Da/zbc5M1wvJvZEWH/v8tFwnKQ5eHYZ8K+e8O+vcpF2hJjOO48MysLx6xqboyI9EyayRaRThEmusNimp6OySExMw+/PdLd17ewjxTgKeB44DqgFDgduA0YDMwJ+08GnicoGbkHqG9hf72AU4DH3H13G8+jF7AEOBa4MYzhROBHwEDg6rhNbgf+BFxAMLv+c4IPATPbcU6xfgk8BpwLTCAob+kFTCaYkd8Utj1mZsPcvSI8zunAE+GxLgz39X3gb2aW5+7vmNlw4EngD8ANwG6C2d7hbfnZxP2cvhrGei+wGBgB/A7oGzfucOAfwPvAlcAH4bk9ambT3P3JuF0/TpBI/zdBKdBPgHfCtqeAn4bn/yU+LA15r5kQbyQoUboD+BbBh74PWjiXNsVoQU36Q2EcVxNcv9uBQ4DVLfyoRKSncHe99NJLr4S/CJLS4wiSx1Xh98cR1L5eGfM+bR/7OANw4OK49mgiPSh83yscd30rMeWE425qpq9X7Cum/SvhNhPixl9LkJQeFr6fGI67L27cnQS1v9bOc4rub2HcuFfD9pNj2vLCtpkxbW8Bz8Rt249gFv+X4fvp4Xb92nlto7FNDt+nECS+hXHjzg3H3RvTtoAgac2OG/sX4LWY99eH214SN64UWBrz/uJw3Ihm4vzIvwmCDyYOTIwbt76DMf4deANIiWn7dHiM5cn+HdRLL72S+1K5iIh0Cnd/w91fAw4nSDheA2oIZjYfcffXwte+ZpQnENRv/y6u/UEgjeDGtvawZhvNPgbsiX2FM9gAU4ENwAtm1iv6ApYSzFieGLe7p+LelwK9CRL8jpzTn+PelwE1HlNTHLZB8LPGzEYCRwGL4mKuBV4MYwB4LTzf35vZdDM7jI75RPh6OK79UaAhrm0q8DSwMy62JUC+mfWLGx//83wdGNrBONuq1RjNLBU4AfiDuzdFN3T3fxAk7SLSwynJFpGEC2uGo4nJZ4AXw+8/S1DeUB72N5v0xhgIbHP3+PKP8pj+9qggmFWOT9IqCBKmE4C74/oOIyh72RP3einsz44bH7/CRTT2PjExt+ectse93w3siG2I+aASPUY0WV7QTNxnRGN297eAAoL/FzxAcF3+YWan0D7RJQ+3xMXVAGyNG3sYcFEzcc0L+9vy8+xD52pLjIMIPmRtaWb75tpEpIdRTbaIdIZnCGqfox4IX1F7wq+f48Ob0ZqzDRhoZmlxM94fC7/GJ3D75O4NZvYcMCV2n2Ey+AqAmZ0Rt9lWYB3w5RZ2u749MZDgc2pBdB8/AIqa6f/3cd19GbDMzHoTfCC6AXjKzI7wsL67DaI10DmxjeEHq/ikeSvwN+CWFva1uY3H7ExtibGB4N9xTjP9OQR//RCRHkxJtoh0hssIykLOBaYB54ftTxPcGLYkfN/azWF/Bb5HcFPbopj2GQSJYvwSbW3xc4La2lsIasNbUwh8Eah297LWBrdBZ5xTvNUEyf8n3f3mtmwQzqw/a8E65k8ARxLM8LfFuwQ12V8GFsa0f5G9/z9TSFASs8rd69q4/32J/kUgPQH7impTjGb2MjDdzK6PloyY2aeBI1CSLdLjKckWkYRz99UAZvYj4Cl3f8XMjib4E/sCdy/f5w4+9GeCVUPmm9lgghsovwDMJrh5sa1JYGxsz5jZNcDNZpYH3E8wU92H4CE55xHUjkdXP4k+uOQZM/sFUEJQO30UcBYwzd1r2xFCws8pnru7mX0LeMLM0ghqpSsIZlhPAja6+21m9nWC+uynCZLkQQSz35sJap/berwmM/sJcI+Z/Rb4PcHqIj9g74e8XEdQavOcmd1J8GFgADAGGO7us9p5um+EX79lZvcRzC6vbKXWvzVtjfHHBLX5/2dmvyFYXeQnfFj6IyI9mJJsEekUYXI3iWAFCwieoPjPdiTY0eTtdOBnBMvPZRMkPFcRLBfXIe7+czP7O3B5uO/BBLXaqwmWn5vv7o3h2D0WrEF9DcHTII8kSMLfJrgpr13JXGedUzPHedrMJhCsgnIPwUxvOcFM+eJwWAnBdbmJoA55G8EHgBntnWV29wXhLPhVBH+5eJ3gA8uDceM2mlmEYPWQ6M9+azj+vg6cZ0m4FvalwNcI6suPZD9uPmxrjO5eZGYzwnGPEazocgXBvysR6eGiS0qJiIiIiEiCaHUREREREZEEU5ItIiIiIpJgSrJFRERERBJMSbaIiIiISIIpyRYRERERSTAl2SIiIiIiCaYkW0REREQkwZRki4iIiIgkmJJsEREREZEE+//q6dQyeYvZLwAAAABJRU5ErkJggg==\n",
      "text/plain": [
       "<Figure size 720x144 with 1 Axes>"
      ]
     },
     "metadata": {
      "needs_background": "light"
     },
     "output_type": "display_data"
    }
   ],
   "source": [
    "import matplotlib.pyplot as plt\n",
    "%matplotlib inline\n",
    "\n",
    "for precision in 0.95, 0.9:\n",
    "    plt.figure(figsize=(10, 2))\n",
    "    colors = ['#DDDDDD', '#AAAAAA', '#777777', '#444444', '#000000']\n",
    "    recalls = [0.5, 0.6, 0.7, 0.8, 0.9]\n",
    "    for y, bins in zip((0, 1), (vamb_bins, metabat_bins)):\n",
    "        for color, recall in zip(colors, recalls):\n",
    "            plt.barh(y, bins.counters[1][(recall, precision)], color=color)\n",
    "\n",
    "    plt.title(str(precision), fontsize=18)\n",
    "    plt.yticks([0, 1], ['Vamb', 'MetaBAT2'], fontsize=16)\n",
    "    plt.xticks([i*25 for i in range(5)], fontsize=13)\n",
    "    plt.legend([str(i) for i in recalls], bbox_to_anchor=(1, 1.1), title='Recall', fontsize=12)\n",
    "    \n",
    "    if precision == 0.9:\n",
    "        plt.xlabel('# of Genomes Identified', fontsize=16)\n",
    "    plt.gca().set_axisbelow(True)\n",
    "    plt.grid()"
   ]
  }
 ],
 "metadata": {
  "@webio": {
   "lastCommId": null,
   "lastKernelId": null
  },
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.10"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
